MohammadJRanjbar/Natural-Language-Processing-course

Natural-Language-Processing-course Machine Learning

Welcome to the Natural Language Processing course repository offered at the University of Tehran. This repository contains code for assignments and projects completed during the course. The course by:

Course Description

This NLP course offers a comprehensive curriculum covering various essential topics. Students will learn the basics of text processing, including Regular Expressions and Text Normalization for pattern matching and text preparation. They will explore Morphology to understand word structure, Tokenization for breaking text into meaningful units, and Edit Distance and Spell Correction techniques for identifying and correcting spelling errors.
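To make the edit distance and spell correction topic concrete, here is a minimal sketch of Levenshtein distance with unit costs, plus a naive corrector that picks the closest word from a small vocabulary. It is illustrative only and is not taken from the course assignments; the function names `edit_distance` and `suggest` and the toy vocabulary are assumptions made for this example.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    # previous[j] is the distance between source[:i-1] and target[:j]
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning source[:i] into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def suggest(word: str, vocabulary: list[str]) -> str:
    """Return the vocabulary entry closest to `word` (ties: first in the list)."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))


if __name__ == "__main__":
    # Hypothetical toy vocabulary, for illustration only.
    print(suggest("speling", ["spelling", "spoiling", "selling"]))
```

A real spell corrector would typically also weight candidates by word frequency or a language model rather than by distance alone.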

The course includes practical applications such as Language Modeling with N-Grams, Naive Bayes Classification, and Sentiment Analysis. Students will also delve into Logistic Regression, gaining valuable insights into its applications in NLP.
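As a rough illustration of language modeling with n-grams, the sketch below counts bigrams over pre-tokenized sentences and estimates conditional probabilities with add-one (Laplace) smoothing. It is a simplified example, not the assignment code; the helper names, the `<s>` and `</s>` boundary markers, and the toy corpus are assumptions chosen for the example.

```python
from collections import Counter


def train_bigram_model(sentences: list[list[str]]):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigram_counts: Counter = Counter()
    bigram_counts: Counter = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigram_counts.update(padded)
        bigram_counts.update(zip(padded, padded[1:]))
    return unigram_counts, bigram_counts


def bigram_probability(prev_word, word, unigram_counts, bigram_counts, vocab_size):
    """Add-one smoothed estimate of P(word | prev_word)."""
    numerator = bigram_counts[(prev_word, word)] + 1
    denominator = unigram_counts[prev_word] + vocab_size
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    corpus = [
        ["i", "am", "sam"],
        ["sam", "i", "am"],
        ["i", "do", "not", "like", "green", "eggs", "and", "ham"],
    ]
    unigrams, bigrams = train_bigram_model(corpus)
    # Vocabulary size here counts every observed type, boundary markers included.
    print(bigram_probability("i", "am", unigrams, bigrams, len(unigrams)))
```

Add-one smoothing is used here only because it is the simplest way to avoid zero probabilities for unseen bigrams; other smoothing schemes usually work better in practice.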

Further, the course delves into Lexical and Vector Semantics, helping students understand word meaning and relationships. Advanced topics like Neural Nets and Neural Language Models, Sequence Labeling for Parts of Speech and Named Entities, and Deep Learning Architectures for Sequence Processing will equip students with modern NLP techniques.
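In vector semantics, relatedness between words is commonly measured as the cosine of the angle between their vectors. The following is a minimal sketch of that measure using only the standard library; the three-dimensional vectors are made-up values for illustration, not embeddings trained in this course.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
    cat = [0.9, 0.1, 0.3]
    dog = [0.8, 0.2, 0.35]
    car = [0.1, 0.9, 0.0]
    print(cosine_similarity(cat, dog), cosine_similarity(cat, car))
```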

Word Senses and WordNet will enable students to work with word sense disambiguation, and Encoder-Decoder models, attention, and LSTM will be taught for sequence-to-sequence tasks. Transformers and Contextual Word Embeddings will be covered, along with Transfer Learning using models like mBERT, XLM-R, GPT, T5, and mT5.
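The attention mechanism mentioned above can be sketched for a single query vector as scaled dot-product attention: score each key against the query, normalize the scores with a softmax, and take the weighted sum of the values. This is a dependency-free illustration under simplifying assumptions (one query, plain Python lists, no learned projections), not the course's implementation.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query over lists of key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Context vector: weighted sum of the value vectors, one dimension at a time.
    context = [sum(w * value[d] for w, value in zip(weights, values)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    # Hypothetical toy inputs: the query matches the first key more closely.
    weights, context = attention(
        query=[1.0, 0.0],
        keys=[[1.0, 0.0], [0.0, 1.0]],
        values=[[10.0, 0.0], [0.0, 10.0]],
    )
    print(weights, context)
```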

The course will also touch on Statistical Machine Translation and Neural Machine Translation, as well as Constituency Grammars, Parsing, and Dialogue Systems including chatbots. Additionally, Information Extraction (NER, RE), Question Answering, and Logical Representations of Sentence Meaning will be explored, offering a comprehensive understanding of NLP applications and techniques.

Table of Contents

Please find below a brief overview of the contents of this repository:

  1. HW1/: This directory contains code for Assignment 1, which focuses on n-grams and different methods of tokenization.
  2. HW2/: This directory contains code for Assignment 2, which focuses on sentiment analysis using Naive Bayes and Logistic Regression, and on training word2vec.
  3. HW3/: This directory contains code for Assignment 3, which focuses on sentiment analysis using LSTM, RNN, and GRU.
  4. HW4/: This directory contains code for Assignment 4, which focuses on zero-shot learning and fine-tuning ParsBERT for the task of natural language inference on FarsTail.
  5. HW5/: This directory contains code for Assignment 5, which focuses on machine translation: we train a translation model with Fairseq.
  6. HW6/: This directory contains code for Assignment 6, which focuses on training a chatbot. In this task, we train a Rasa bot to answer FAQ questions for a ticket-selling company.

Disclaimer

This repository is for archival and reference purposes only. The code here might not be updated or maintained. Use it at your own discretion.

About

Assignments and projects from the interpretable natural language processing course offered at the University of Tehran.
