Multimodal Industrial Anomaly Detection via Hybrid Fusion

The pipeline of Multi-3D-Memory (M3DM). M3DM contains three important parts: (1) Point Feature Alignment (PFA) converts point group features to plane features with interpolation and projection operations, where $\text{FPS}$ is farthest point sampling and $\mathcal F_{pt}$ is a pretrained Point Transformer; (2) Unsupervised Feature Fusion (UFF) fuses point features and image features with a patch-wise contrastive loss $\mathcal L_{con}$, where $\mathcal F_{rgb}$ is a Vision Transformer, $\chi_{rgb},\chi_{pt}$ are MLP layers, and $\sigma_r, \sigma_p$ are single fully connected layers; (3) Decision Layer Fusion (DLF) combines multimodal information with multiple memory banks and makes the final decision with two learnable modules $\mathcal D_a, \mathcal D_s$ for anomaly detection and segmentation, where $\mathcal{M}_{rgb}, \mathcal{M}_{fs}, \mathcal{M}_{pt}$ are memory banks, $\phi, \psi$ are score functions for single memory bank detection and segmentation, and $\mathcal{P}$ is the memory bank building algorithm.
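
To make the $\text{FPS}$ step in the caption more concrete, here is a minimal pure-Python sketch of greedy farthest point sampling. It is only an illustration: the function name `farthest_point_sampling`, the `start_index` argument, and the toy points are assumptions, and the implementation used in this repo may differ.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Return indices of `num_samples` points chosen greedily to be far apart.

    `points` is a sequence of equal-length coordinate tuples (hypothetical
    input format chosen for this sketch).
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")

    selected = [start_index]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]

    while len(selected) < num_samples:
        # Pick the point that is farthest from the current selection.
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        # Refresh nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))

    return selected


if __name__ == "__main__":
    toy_points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (5, 0, 0)]
    print(farthest_point_sampling(toy_points, 3))
```

Each selected index can then serve as the center of a point group. This sketch runs in O(N * num_samples) time, which is fine for illustration but much slower than a GPU implementation.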

Setup

We implement this repo with the following environment:

Python 3.8
PyTorch 1.9.0
CUDA 11.3
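
A quick way to compare a local environment against the versions listed above is sketched below. The helper names `parse_version`, `report_environment`, and `mismatches` are hypothetical and not part of this repo; the sketch only reads `torch.__version__` and `torch.version.cuda` when PyTorch is importable.

```python
import sys

# Versions listed in this README.
EXPECTED = {"python": (3, 8), "torch": (1, 9, 0), "cuda": (11, 3)}


def parse_version(text):
    """Turn a string such as '1.9.0+cu111' into (1, 9, 0)."""
    core = text.split("+")[0]  # drop local build suffixes like '+cu111'
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def report_environment():
    """Collect the versions that can be detected on this machine."""
    found = {"python": tuple(sys.version_info[:2])}
    try:
        import torch
    except ImportError:
        return found  # PyTorch not installed; report Python only
    found["torch"] = parse_version(torch.__version__)
    if torch.version.cuda is not None:
        found["cuda"] = parse_version(torch.version.cuda)
    return found


def mismatches(found, expected=EXPECTED):
    """Return {name: (found, expected)} for every detected version that differs."""
    return {
        name: (found[name], want)
        for name, want in expected.items()
        if name in found and found[name][: len(want)] != want
    }


if __name__ == "__main__":
    print(mismatches(report_environment()))
```

An empty dictionary means every detected version matches the list above. Other versions may also work, but they are not covered by this README.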

Install the other packages via:

pip install -r requirements.txt
# install knn_cuda
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
# install pointnet2_ops_lib
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

Data Download and Preprocessing

Dataset

The MVTec 3D-AD dataset can be downloaded from the official website of MVTec 3D-AD.
The Eyecandies dataset can be downloaded from the official website of Eyecandies.

After downloading, put the datasets in the datasets folder.

Data Preprocessing

Run the preprocessing with:

python utils/preprocessing.py datasets/mvtec3d/

It may take a few hours to run the preprocessing.

Checkpoints

The following table lists the pretrained models used in M3DM:

Backbone	Pretrain Method
Point Transformer	Point-MAE
Point Transformer	Point-BERT
ViT-b/8	DINO
ViT-b/8	Supervised ImageNet 1K
ViT-b/8	Supervised ImageNet 21K
ViT-s/8	DINO

Put the checkpoint files in the checkpoints folder.

Train and Test

Train and test the double lib version and save the features for UFF training:

python3 main.py \
--method_name DINO+Point_MAE \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--save_feature True

Train the UFF:

OMP_NUM_THREADS=1 python3 -m torch.distributed.launch --nproc_per_node=1 fusion_pretrain.py \
--accum_iter 16 \
--lr 0.003 \
--batch_size 16 \
--data_path datasets/patch_lib \
--output_dir checkpoints
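
UFF is trained with a patch-wise contrastive loss $\mathcal L_{con}$. The sketch below shows a generic InfoNCE-style version of that idea in pure Python, where the RGB feature of a patch should match the point feature of the same patch and not the features of other patches. It is an assumption-laden illustration, not the loss implemented in fusion_pretrain.py: the function name, the cosine normalization, and the `temperature` default are all hypothetical.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def patchwise_contrastive_loss(rgb_feats, pt_feats, temperature=0.2):
    """Generic InfoNCE loss over patches.

    rgb_feats[i] and pt_feats[i] are the features of the same patch
    (positive pair); every other pairing is treated as a negative.
    """
    if len(rgb_feats) != len(pt_feats):
        raise ValueError("both modalities need the same number of patches")

    rgb = [_normalize(v) for v in rgb_feats]
    pt = [_normalize(v) for v in pt_feats]

    total = 0.0
    for i, anchor in enumerate(rgb):
        # Cosine similarity between RGB patch i and every point patch.
        logits = [
            sum(a * b for a, b in zip(anchor, other)) / temperature
            for other in pt
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching patch as the target.
        total += log_denominator - logits[i]
    return total / len(rgb)


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0]]
    print(patchwise_contrastive_loss(rgb, [[1.0, 0.0], [0.0, 1.0]]))  # aligned
    print(patchwise_contrastive_loss(rgb, [[0.0, 1.0], [1.0, 0.0]]))  # shuffled
```

In this toy example the aligned pairing yields a much smaller loss than the shuffled one, which is the behavior a contrastive objective is meant to encourage.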

Train and test the full setting with the following command:

python3 main.py \
--method_name DINO+Point_MAE+Fusion \
--use_uff \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--fusion_module_path checkpoints/{FUSION_CHECKPOINT}.pth

Note: if you set --method_name DINO or --method_name Point_MAE, set --memory_bank single at the same time.
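
The rule in the note above can be expressed as a small guard, sketched here in Python. `check_memory_bank` and `SINGLE_MODALITY_METHODS` are hypothetical names used for illustration; main.py may or may not perform such a check itself.

```python
# Method names that use a single modality, as listed in the note above.
SINGLE_MODALITY_METHODS = {"DINO", "Point_MAE"}


def check_memory_bank(method_name, memory_bank):
    """Return memory_bank, or raise if it conflicts with a single-modality method."""
    if method_name in SINGLE_MODALITY_METHODS and memory_bank != "single":
        raise ValueError(
            f"--method_name {method_name} requires --memory_bank single, "
            f"got {memory_bank!r}"
        )
    return memory_bank


if __name__ == "__main__":
    print(check_memory_bank("DINO", "single"))
    print(check_memory_bank("DINO+Point_MAE", "multiple"))
```

Calling the guard with a single-modality method and `multiple` raises a `ValueError` before any training starts.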

If you find this repository useful for your research, please cite the following:

@misc{wang2023multimodal,
  title={Multimodal Industrial Anomaly Detection via Hybrid Fusion},
  author={Wang, Yue and Peng, Jinlong and Zhang, Jiangning and Yi, Ran and Wang, Yabiao and Wang, Chengjie},
  year={2023},
  eprint={2303.00601},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Thanks

Our repo is built on 3D-ADS and MoCo-v3; thanks to the authors for their extraordinary work!


TencentYoutuResearch/AnomalyDetection-M3DM
