Prompting Techniques for NLG in the Medical Domain

Generating automatic explanations for the predictions of machine learning models has long since been a major challenge in Artificial Intelligence, especially when considering sensible domains like healthcare. In this thesis, we approach the problem of generating a fluent, natural language justification for a medical diagnosis given all the information in the case description and the disease's symptomatology. We treat this problem as a data-to-text task and solve it using prompting techniques for natural language generation. In particular, we propose two architectural modifications to standard Prefix Tuning called Layer Dependency and Prefix Pooling; we evaluate their results, comparing with current state-of-the-art methods for the data-to-text task, on both a general-purpose benchmark (WebNLG) and on a dataset of clinical cases and relative explanations built as part of the ANTIDOTE project.

Validation BLEU on WebNLG for model t5-base

Results show that layer dependency boosts the generation capabilities of our models when training on a limited computational budget, while Prefix Pooling is a valid dynamic prompting technique that achieves performances on par with the current state-of-the-art without requiring any additional information to be associated with the input. Finally, we note how, for our domain and in the context of the ANTIDOTE project, interpreting the explanation-generation task as data-to-text is a viable approach which produces high-quality justifications.

Note that all our experiments were tracked with Weigths & Biases and can be publicly accessed here.

Requirements

The code has been tested with Python 3.10.4.
In order to install the required python packages it's recommended to (i) create a virtual environment, (ii) install torch following the official instructions, and (iii) simply run the below script to install the remaining requirements.

  pip install -U pip
  pip install -r requirements.txt

Usage

In order to train and compute validation and test BLEU score of a model you can run

 python main.py path/to/your/config_file.yaml

where config_file.yaml is a YACS configuration file following those in the config folder. Note that, if during training or evaluation you want to use wandb to log your metrics you also need to run wandb login before the commands above in order to login to your account.

Name		Name	Last commit message	Last commit date
Latest commit History 85 Commits
assets		assets
data		data
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

assets

assets

data

data

src

src

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

main.py

main.py

requirements.txt

requirements.txt

Repository files navigation

Prompting Techniques for NLG in the Medical Domain

Requirements

Usage

About

Releases

Packages

Languages

License

mwritescode/data2text-prompting

Folders and files

Latest commit

History

Repository files navigation

Prompting Techniques for NLG in the Medical Domain

Requirements

Usage

About

Topics

Resources

License

Stars

Watchers

Forks

Languages