
Natural Language Processing - Cristiano De Nobili

This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP). It is about Natural Language Processing, in particular recent progress involving transformer-based models. I must thank the innovative startup AINDO for its support.

Cristiano holds a Ph.D. in Theoretical Physics (SISSA) and has been actively working in Deep Learning for four years. He is currently part of the Bixby project, Samsung's voice assistant. He is also a TEDx speaker (here is his talk about AI, humans, and their future) and a civil pilot (PPL). Here are his contacts:

  • If you are interested in science and tech news: LinkedIn & Twitter;
  • On his website you can find all his lectures, workshops, and talks;
  • His Instagram is about flying, traveling, and adventure; it is the social platform he uses the most.

Also have a look at the first part of the course, Introduction to Neural Networks (with PyTorch), by Alessio Ansuini, and the third part, Deep Generative Models with TensorFlow 2, by Piero Coronica.

Course Outline

You can find the videos of the lectures here. For this year, I decided to use PyTorch as the main Deep Learning library.

  • Lecture 1: intro to NLP, text preprocessing, spaCy, common NLP tasks (NER, POS tagging, sentence classification, ...), non-contextual word embeddings, a SkipGram Word2Vec model coded from scratch, pre-trained GloVe embeddings with Gensim, and an intro to contextual word embeddings and the (self-)attention mechanism.
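To make the SkipGram idea concrete: the model is trained on (centre, context) word pairs extracted with a sliding window. A minimal sketch of that pair extraction, in plain Python (a hypothetical helper for illustration, not the notebook's actual code):

```python
# Sketch of SkipGram training-pair generation: for each centre word,
# emit (centre, context) pairs for neighbours within a fixed window.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the centre word itself
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox".split()
print(skipgram_pairs(sentence, window=1))
# [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'),
#  ('brown', 'quick'), ('brown', 'fox'), ('fox', 'brown')]
```

In the actual Word2Vec training loop, these pairs feed a shallow network whose hidden weights become the word vectors.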

  • Lecture 2: transfer learning main concepts, transformer-based models, how BERT-like models are trained and fine-tuned on downstream tasks, intro to the Hugging Face Transformers library, tokenization, language modeling with English and non-English (Italian GilBERTo and UmBERTo) pre-trained AutoModels, and some examples of NLP problems solved with the Transformers pipeline.
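A key ingredient covered here is subword tokenization: BERT-like models split rare words into pieces from a fixed vocabulary. A toy greedy longest-match tokenizer in the spirit of WordPiece (the vocabulary below is made up for illustration; real tokenizers are trained on large corpora and come with the pre-trained model):

```python
# Toy greedy longest-match subword tokenizer (WordPiece-style).
# "##" marks a piece that continues a word rather than starting one.
def wordpiece_tokenize(word, vocab):
    tokens, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1  # shrink the candidate piece from the right
        if match is None:
            return ["[UNK]"]  # no piece matched: unknown token
        tokens.append(match)
        start = end
    return tokens

vocab = {"play", "##ing", "##ed", "un"}
print(wordpiece_tokenize("playing", vocab))  # ['play', '##ing']
```

With the real library, the equivalent step is handled by `AutoTokenizer.from_pretrained(...)`, which downloads the trained vocabulary alongside the model.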

  • Lecture 3: fine-tune a pre-trained Italian RoBERTa model to solve word-sense disambiguation, embedding geometry, clustering (t-SNE and UMAP), and visualization (this lecture is a bit advanced). Part of this notebook uses PyTorch Lightning.
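The visualization step in this lecture projects high-dimensional contextual embeddings down to 2-D so clusters of word senses can be plotted. A minimal sketch with scikit-learn's t-SNE (random vectors stand in for the real BERT-style embeddings, and the parameter values are illustrative):

```python
# Project high-dimensional embeddings to 2-D with t-SNE for visual
# clustering; random data stands in for real contextual embeddings.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 768))  # 50 fake 768-dim embeddings

proj = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(embeddings)
print(proj.shape)  # (50, 2) -> ready for a 2-D scatter plot
```

In the notebook, the same projection (with t-SNE or UMAP) is applied to embeddings of a target word in different sentences, so that distinct word senses ideally appear as separate clusters.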

Useful links and references are inside each notebook. For any doubts or questions, feel free to contact me!