MedICap: A Concise Model for Medical Image Captioning

MedICap is a medical image captioning model that placed first in the ImageCLEFmedical Caption 2023 challenge (https://www.imageclef.org/2023/medical/caption), entered as team CSIRO. The model is available on the Hugging Face Hub: https://huggingface.co/aehrc/medicap. It is described in the working notes and was presented at CLEF 2023.

Working notes:

https://www.dei.unipd.it/~faggioli/temp/CLEF2023-proceedings/paper-132.pdf

BibTeX:

@inproceedings{nicolson_aehrc_2021,
	address = {Thessaloniki, Greece},
	title = {A {C}oncise {M}odel for {M}edical {I}mage {C}aptioning},
	copyright = {All rights reserved},
	language = {en},
	booktitle = {Proceedings of the 14th {International} {Conference} of the {CLEF} {Association}},
	author = {Nicolson, Aaron and Dowling, Jason and Koopman, Bevan},
	month = sep,
	year = {2023},
}


Figure: The decoder conditioned on the visual features of the image via (A) cross-attention and (B) self-attention. The visual features are extracted with the encoder. CC BY [Muacevic et al. (2022)]. 𝑁 is the number of Transformer blocks. `[BOS]` is the beginning-of-sentence special token.
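As a rough illustration of variant B (conditioning via self-attention), the sketch below builds the attention mask for a decoder input in which the visual features come first, followed by `[BOS]` and the caption tokens. It assumes the visual features are placed before `[BOS]` and that a standard causal mask is applied; neither detail is confirmed here, and `causal_mask`, `num_visual`, and `num_tokens` are hypothetical names, so treat it as a sketch rather than the model's implementation.

```python
def causal_mask(num_visual, num_tokens):
    """Build a causal attention mask over [visual features..., [BOS], caption tokens...].

    mask[i][j] is True when position i may attend to position j.
    Assumption: visual features occupy the first num_visual positions.
    """
    length = num_visual + num_tokens
    return [[j <= i for j in range(length)] for i in range(length)]


# Three visual positions, then [BOS] and one caption token (illustrative sizes).
mask = causal_mask(num_visual=3, num_tokens=2)

# Row 3 is the [BOS] position: it sees every visual feature and itself,
# but not the caption token that follows it.
print(mask[3])  # [True, True, True, True, False]
```

Under these assumptions, every caption position can attend to all visual features through ordinary self-attention, which is why no separate cross-attention layer is needed in this variant.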

Hugging Face model & checkpoint:

The Hugging Face model & checkpoint are available at: https://huggingface.co/aehrc/medicap.

Notebook example:

An example of MedICap generating captions is given in example.ipynb.

Installation:

After cloning the repository, install the required packages in a virtual environment. The required packages are located in requirements.txt:

python -m venv --system-site-packages venv
source venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade -r requirements.txt --no-cache-dir

Test the Hugging Face checkpoints:

To test the Hugging Face model:

dlhpcstarter -t imageclefmed_caption_2023_hf -c config/test_huggingface/007_no_ca_scst.yaml --stages_module tools.stages --test

See dlhpcstarter==0.1.4 for more options.

Note: data will be saved in the experiment directory (exp_dir in the configuration file).

Training:

To train with teacher forcing:

dlhpcstarter -t imageclefmed_caption_2023 -c config/train/002_no_ca.yaml --stages_module tools.stages --train

The model can then be tested with the --test flag:

dlhpcstarter -t imageclefmed_caption_2023 -c config/train/002_no_ca.yaml --stages_module tools.stages --test
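For readers unfamiliar with teacher forcing, the first training stage above feeds the ground-truth caption to the decoder and asks it to predict the next token at every position. The sketch below shows only the input/target shift; the token ids and the `[EOS]` id are hypothetical, and `teacher_forcing_pair` is an illustrative helper, not a function from this repository.

```python
def teacher_forcing_pair(token_ids):
    """Split one tokenised caption into decoder inputs and next-token targets."""
    if len(token_ids) < 2:
        raise ValueError("need a start token and at least one further token")
    decoder_inputs = token_ids[:-1]  # ground-truth tokens fed to the decoder
    targets = token_ids[1:]          # token the decoder should predict at each step
    return decoder_inputs, targets


# Hypothetical ids: 0 = [BOS], 1 = [EOS], the rest are caption tokens.
caption = [0, 17, 42, 8, 1]
inputs, targets = teacher_forcing_pair(caption)
print(inputs)   # [0, 17, 42, 8]
print(targets)  # [17, 42, 8, 1]
```

The training loss is then typically the cross-entropy between the decoder's prediction at each position and the corresponding target.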

To then train with Self-Critical Sequence Training (SCST) using the BERTScore reward:

Copy the path to the checkpoint from the exp_dir of the teacher forcing configuration above, paste it into the SCST configuration as warm_start_ckpt_path, then run:

dlhpcstarter -t mimic_cxr -c config/train/007_no_ca_scst.yaml --stages_module tools.stages --train
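The checkpoint path needed for warm_start_ckpt_path can also be located with a short script instead of by hand. The sketch below assumes checkpoints are saved with a `.ckpt` extension somewhere under exp_dir (the exact layout is not documented here) and simply picks the most recently modified one; `latest_checkpoint` and `warm_start_line` are hypothetical helpers.

```python
from pathlib import Path


def latest_checkpoint(exp_dir, pattern="*.ckpt"):
    """Return the most recently modified checkpoint under exp_dir, or None."""
    candidates = sorted(Path(exp_dir).rglob(pattern), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


def warm_start_line(exp_dir):
    """Format the YAML line to paste into the SCST configuration."""
    ckpt = latest_checkpoint(exp_dir)
    if ckpt is None:
        raise FileNotFoundError(f"no checkpoint found under {exp_dir}")
    return f"warm_start_ckpt_path: {ckpt.resolve()}"


# Usage (replace the argument with the exp_dir from your configuration file):
# print(warm_start_line("path/to/exp_dir"))
```

The most recent checkpoint is not necessarily the best-scoring one, so check which checkpoint you want to warm-start from before pasting the line.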

See dlhpcstarter==0.1.4 for more options.
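For context on the SCST stage, the objective in its usual formulation scales the log-probability of a sampled caption by how much its reward exceeds the reward of the greedily decoded caption. The sketch below illustrates that formulation for a single caption; the rewards are placeholder numbers standing in for BERTScore values, and `scst_loss` is a hypothetical function, not code from this repository.

```python
def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical loss for one sampled caption.

    sample_logprobs: log-probability of each token in the sampled caption.
    sample_reward:   reward of the sampled caption (e.g. its BERTScore).
    greedy_reward:   reward of the greedy caption, used as the baseline.
    """
    advantage = sample_reward - greedy_reward
    return -advantage * sum(sample_logprobs)


# Placeholder values: the sample scores higher than the greedy baseline,
# so minimising the loss raises the probability of the sampled caption.
loss = scst_loss([-0.5, -1.0, -0.25], sample_reward=0.75, greedy_reward=0.5)
print(loss)  # 0.4375
```

In an actual training loop the rewards are treated as constants, so gradients flow only through the token log-probabilities.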

Help/Issues:

If you need help, or if you run into any problems, please open an issue and we will get back to you as soon as possible.
