
Transformers for Data Scientists in a rush

Low-code pre-built pipelines for experiments with huggingface/transformers for Data Scientists in a rush.


This repository contains low-code, easy-to-understand, pre-built pipelines for fast experimentation on NLP tasks using huggingface/transformers pre-trained language models. The pipelines are explained and explored in an accompanying series of Medium posts.

The project was inspired by a LinkedIn post by Thomas Wolf, Hugging Face's CSO, showing an image of a low-code pipeline for fast experimentation with their Transformers library. As I could not find anything like it implemented anywhere, I decided to build it myself.

Index

As of now, we have:

Classification

The classification example uses an email spam classification dataset from Kaggle and optuna for hyperparameter tuning.

You can run it from the classification directory with:

python classification-experiment.py --model-name bert-base-multilingual-cased --metric f1_score --train-data-path train.csv --test-data-path test.csv --max-sequence-length 25 --label-nbr 2

It should yield an f1_score higher than 0.9.
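
For a rough sense of what such a pipeline does under the hood, here is a minimal sketch of BERT fine-tuning with an optuna study over learning rate and batch size. It is illustrative only, not the repository's actual code: it assumes train.csv and test.csv have "text" and integer "label" columns, and it trains for a single epoch per trial.

import optuna
import pandas as pd
import torch
from sklearn.metrics import f1_score
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def make_dataset(path, max_len=25):
    # Assumes a CSV with "text" and integer "label" columns (an assumption,
    # not the repository's documented format).
    df = pd.read_csv(path)
    enc = tokenizer(list(df["text"]), truncation=True, padding="max_length",
                    max_length=max_len, return_tensors="pt")
    return TensorDataset(enc["input_ids"], enc["attention_mask"],
                         torch.tensor(df["label"].values))

train_ds = make_dataset("train.csv")
test_ds = make_dataset("test.csv")

def objective(trial):
    # Sample this trial's hyperparameters.
    lr = trial.suggest_float("lr", 1e-5, 5e-5, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32])

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    # One epoch of fine-tuning, kept short for the sketch.
    model.train()
    for input_ids, mask, labels in DataLoader(train_ds, batch_size=batch_size,
                                              shuffle=True):
        optimizer.zero_grad()
        model(input_ids, attention_mask=mask, labels=labels).loss.backward()
        optimizer.step()

    # Score the trial on the held-out set with F1.
    model.eval()
    preds, golds = [], []
    with torch.no_grad():
        for input_ids, mask, labels in DataLoader(test_ds, batch_size=64):
            logits = model(input_ids, attention_mask=mask).logits
            preds.extend(logits.argmax(dim=-1).tolist())
            golds.extend(labels.tolist())
    return f1_score(golds, preds)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=10)
print(study.best_params, study.best_value)

Each trial fine-tunes a fresh model with the sampled hyperparameters and reports its test-set F1; study.best_params then holds the best combination found.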


Made by Pi Esposito
