conditioning-transformer

Design of a transformer-based architecture for object detection conditioned by metadata:

  • DEtection TRansformer (DETR)
  • Vision Transformer (ViT) ???

Metadata integration strategies

We implement the following strategies for incorporating metadata into image processing:

  • Baseline (no metadata)
  • Early concatenation
  • Early summation
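The two metadata-aware strategies can be sketched as follows. This is a minimal illustration using plain Python lists, not the repository's actual API; the function and variable names are hypothetical. Early summation adds the metadata embedding to every image token (so the two must have the same dimension), while early concatenation appends the embedding as one extra token:

```python
def early_sum(image_tokens, meta_embedding):
    """Add the metadata embedding element-wise to every image token."""
    return [[t + m for t, m in zip(token, meta_embedding)]
            for token in image_tokens]

def early_concat(image_tokens, meta_embedding):
    """Append the metadata embedding to the sequence as an extra token."""
    return image_tokens + [meta_embedding]

# Toy example: two 2-d image tokens and one 2-d metadata embedding.
tokens = [[1.0, 2.0], [3.0, 4.0]]
meta = [0.5, -0.5]

summed = early_sum(tokens, meta)      # still 2 tokens, values shifted
concat = early_concat(tokens, meta)   # now 3 tokens
```

Note the trade-off this sketch makes visible: summation keeps the sequence length fixed, while concatenation grows it by one token that the transformer can attend to separately.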

Installation

To install the project, clone the repository and install the necessary dependencies:

git clone https://github.com/MarcoParola/conditioning-transformer.git
cd conditioning-transformer
mkdir models data

Create and activate a virtual environment, then install the dependencies.

python -m venv env
. env/bin/activate
python -m pip install -r requirements.txt 

Next, create a new project on Weights & Biases. Log in and paste your API key when prompted.

wandb login 

Usage

To perform a training run, set the model parameter:

python train.py model=detr

model can take one of the following values: detr, early-sum-detr, early-concat-detr, early-shift-detr.

To run inference on the test set and compute metrics, specify the path to the model weights via the weight parameter (I usually download the weights from wandb and copy them into the checkpoint folder).

python test.py model=detr weight=checkpoint/best.pt

Acknowledgement

Special thanks to @clive819 for making an implementation of DETR public here, and to @hustvl for the original YOLOS implementation.
