SMOPE-Net/SMOPE-NET


We introduce the Simultaneous Multiple Object detection and Pose Estimation Network (SMOPE-Net), which performs multi-object detection and pose estimation in an end-to-end manner. SMOPE-Net extracts object model features and fuses them with image features to infer object categories, 2D detection boxes, poses, and visibility. We run experiments and comparisons against existing methods on multiple datasets, including the new KITTI-3D dataset and the existing LineMod-O dataset, and our method outperforms existing methods for pose estimation on both.



Results

SMOPE-Net on KITTI-3D dataset

Comparison results on KITTI-3D dataset

SMOPE-Net on LineMod-O dataset

SMOPE-Net

Schematics of the end-to-end trainable SMOPE-Net: The network expects images and $N_m$ 3D object models as input. The Deformable-DETR (D-DETR) block provides a 256-dimensional feature vector for each of the queries. It also provides detected bounding boxes (Bboxes) for the input image under a detection loss. A 3D Encoder learns a 256-dimensional latent space from the 3D models. The model features in this space are used by the 3D Decoder to estimate model points, scales, and centers to reconstruct the models under a reconstruction loss. The latent features are also used by the 3D Attention module to compute attention maps for the queries, and by the 3D Model Pose module to subsequently predict the model class and 6DoF object pose estimates. Both components are used for computing the pose loss.

(Figure: SMOPE-Net framework)
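To make the fusion step above concrete, here is a minimal NumPy sketch of the query/model feature fusion the 3D Attention module performs. The shapes ($N_m$ models, 256-dimensional features) follow the text; the number of queries, the scaled dot-product attention formulation, and the random head weights are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of fusing D-DETR query features with 3D-model latent features.
# Dimensions follow the README text; the attention form is an assumption.
import numpy as np

rng = np.random.default_rng(0)
N_q, N_m, D = 300, 5, 256          # queries, 3D models, feature dimension

query_feats = rng.standard_normal((N_q, D))   # per-query features from D-DETR
model_feats = rng.standard_normal((N_m, D))   # latent features from the 3D Encoder

# 3D Attention: each query attends over the N_m model latents.
scores = query_feats @ model_feats.T / np.sqrt(D)          # (N_q, N_m)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)                    # softmax over models

fused = attn @ model_feats                                 # (N_q, D) fused features

# 3D Model Pose head: class and 6DoF pose per query (random weights here,
# purely to show the output shapes).
class_logits = fused @ rng.standard_normal((D, N_m))       # (N_q, N_m)
pose = fused @ rng.standard_normal((D, 6))                 # (N_q, 6)

print(fused.shape, class_logits.shape, pose.shape)
```

The attention weights over models double as a soft model-class assignment per query, which is one way to read the "attention maps for the queries" in the figure caption.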

Installation

Requirements

  • Linux, CUDA>=10.0, GCC>=5.4

  • Python>=3.8

    We recommend using Anaconda to create a conda environment:

    conda create -n detr_6dof python=3.8 pip

    Then, activate the environment:

    conda activate detr_6dof
  • PyTorch>=1.9.1 (following instructions here)

    For example, if your CUDA version is 10.2, you can install PyTorch and torchvision as follows:

    conda install -c pytorch pytorch=1.9.1 torchvision cudatoolkit=10.2
  • PyTorch3D (following instructions here)

    • Using Anaconda Cloud (Linux only):
    conda install pytorch3d -c pytorch3d
    • Installing from source (GitHub):
    pip install "git+https://github.com/facebookresearch/pytorch3d.git"
  • Other requirements

    pip install -r requirements.txt
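After installing, a quick way to confirm the key packages are importable is the short check below. It only tests that the names resolve; the version constraints above still apply.

```python
# Sanity check: are the main dependencies importable in this environment?
import importlib.util

status = {
    pkg: importlib.util.find_spec(pkg) is not None
    for pkg in ("torch", "torchvision", "pytorch3d")
}
for pkg, ok in status.items():
    print(f"{pkg}: {'found' if ok else 'MISSING'}")
```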

Compiling CUDA operators

cd ./models/ops
sh ./make.sh

Preparation

  • Clone this repository

    git clone <repository url>

KITTI-3D

  • Download the dataset from here.

  • Run creat_dataset.py:

cd dataset_labelImg3d
python creat_dataset.py

Test

  • Download the network weights file

  • Modify the <project path>/configs/__init__.py to

    7. configure_name = "config.json"
    8. # configure_name = 'config_linemod.json'
  • Modify the <project path>/configs/config.json

    1.    {
    2.       "dataset_name": "KITTI3D",
    3.       "dataset_path": <your downloaded KITTI-3D folder>,
    4.       "poses": true,
    5.       "eval": true,
        ...
    11.      "output_dir": "../output_dir_pose",
        ...
    76.      "train": {
        ...
    79.          "resume": <your downloaded weights>,
  • Activate your python environment, and run

    cd <project folder>
    python main.py
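If you prefer not to edit config.json by hand, the fields listed above can be set programmatically. The field names below follow this README's snippet; the helper name and example paths are illustrative and not part of the repository.

```python
# Sketch: build the config.json fields this README tells you to edit.
# Field names follow the snippet above; paths here are placeholders.
import json

def configure_eval(cfg: dict, dataset_path: str, weights_path: str) -> dict:
    cfg["dataset_name"] = "KITTI3D"
    cfg["dataset_path"] = dataset_path
    cfg["poses"] = True
    cfg["eval"] = True                        # set False for training
    cfg.setdefault("train", {})["resume"] = weights_path
    return cfg

cfg = configure_eval({}, "/data/KITTI-3D", "/data/weights/checkpoint.pth")
print(json.dumps(cfg, indent=2))   # write this back to configs/config.json
```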

Train

  • Modify the <project path>/configs/__init__.py to

    7. configure_name = "config.json"
    8. # configure_name = 'config_linemod.json'
  • Modify the <project path>/configs/config.json

    1.    {
    2.       "dataset_name": "KITTI3D",
    3.       "dataset_path": <your downloaded KITTI-3D folder>,
    4.       "poses": true,
    5.       "eval": false,
        ...
    11.      "output_dir": <your save folder, e.g. "../output_dir_pose">,
        ...
    76.      "train": {
    77.          "start_epoch": 0,
    78.          "end_epoch": 1000,
    79.          "resume": "",
    80.          "batch_size": 4, <adjust according to your GPU capacity>
        ...
        
  • Activate your python environment, and run

    cd <project folder>
    python main.py

LineMod-O

  • Download the training and testing dataset from here

Test

  • Download the network weights file

  • Modify the <project path>/configs/__init__.py to

    7. # configure_name = "config.json"
    8. configure_name = 'config_linemod.json'
  • Modify the <project path>/configs/config_linemod.json

    1.    {
    2.       "dataset_name": "Linemod_preprocessed",
    3.       "dataset_path": <your downloaded Linemod_preprocessed folder>/02,
    4.       "poses": true,
    5.       "eval": true,
        ...
    11.      "output_dir": "../output_dir_pose",
        ...
    76.      "train": {
        ...
    79.          "resume": <your downloaded weights>,
        ...
  • Activate your python environment, and run

    cd <project folder>
    python main.py

Train

  • Modify the <project path>/configs/__init__.py to

    7. # configure_name = "config.json"
    8. configure_name = 'config_linemod.json'
  • Modify the <project path>/configs/config_linemod.json

    1.    {
    2.       "dataset_name": "Linemod_preprocessed",
    3.       "dataset_path": <your downloaded Linemod_preprocessed folder>/02,
    4.       "poses": true,
    5.       "eval": false,
        ...
    11.      "output_dir": <your save folder, e.g. "../output_dir_pose">,
        ...
    81.      "train": {
    82.          "start_epoch": 0,
    83.          "end_epoch": 1000,
    84.          "resume": "",
    85.          "batch_size": 4, <adjust according to your GPU capacity>
        ...
        
  • Activate your python environment, and run

    cd <project folder>
    python main.py

Citation


Acknowledge

This work is based on PyTorch, PyTorch3D, and Deformable-DETR. It is also inspired by DETR.

License

The methods provided on this page are published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. This means that you must attribute the work in the manner specified by the authors, you may not use this work for commercial purposes, and if you alter, transform, or build upon this work, you may distribute the resulting work only under the same license. If you are interested in commercial usage, you can contact us for further options.
