
priyammaz/HAL-DL-From-Scratch


Deep Learning from End to End


Open Source Learning and the Democratization of AI

The National Center for Supercomputing Applications (NCSA) Center of AI Innovation (CAII) at the University of Illinois Urbana-Champaign focuses on driving and enabling student-driven research. Its goal is to give people access to state-of-the-art tools and hardware so they can find novel ways to solve unanswered problems. I am one such student, and I have learned an incredible amount and gained a lot of intuition about Artificial Intelligence through my mentors here.

Along with the NCSA, I also want to acknowledge all the amazing open source materials that I have found and learned from over the years. These resources often bridged the gap for me between the theory that justifies a model architecture and its actual implementation. I will do my best to reference that work throughout, so you can see where I first learned it! The purpose of this repository is to bring that wealth of knowledge together into a one-stop shop where everyone, from beginners to researchers, can gain something and continue to push the boundaries of Open Source Research!!

If you want to contribute (and please do if you want!!), go ahead and submit a PR and I can review it!

Getting Started

All of these tutorials can easily be run on the HAL system at the NCSA, and you can follow the new user instructions here to set up an account.

If you prefer to use Google Colaboratory, that will also work fine! You will just need to set up the environment with the specific packages needed (easy pip installs to get those). For the datasets, you can save them in your Google Drive and access them from there!
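
As a rough illustration of the Google Drive option, here is a minimal sketch of how a notebook could pick its data folder depending on where it runs. The function names and the Drive subfolder `HAL-DL-From-Scratch/data` are assumptions made up for this example, not something the repository defines; only `google.colab.drive.mount` is the standard Colab call.

```python
from pathlib import Path


def in_colab() -> bool:
    """Return True if the google.colab package can be imported (we are on Colab)."""
    try:
        import google.colab  # noqa: F401  (only available inside Colab)
        return True
    except ImportError:
        return False


def get_data_dir(drive_subdir: str = "HAL-DL-From-Scratch/data",
                 local_dir: str = "data") -> Path:
    """Pick the dataset folder: Google Drive on Colab, a local folder otherwise.

    drive_subdir is a hypothetical location inside your Drive; change it to
    wherever you actually saved the datasets.
    """
    if in_colab():
        from google.colab import drive

        # Mounting asks for authorization the first time it runs in a session.
        drive.mount("/content/drive")
        return Path("/content/drive/MyDrive") / drive_subdir
    return Path(local_dir)
```

Calling `get_data_dir()` at the top of a notebook would then return a path you can hand to whatever dataset loading code the tutorial uses, without editing paths by hand when switching between HAL and Colab.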

Data Prep

We will be using a couple of datasets in our Deep Learning Adventures!!

Ensure you have a /data folder in the root directory of the git repo, then run the following to download all datasets:

bash download_data.sh 
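
If you would rather drive this step from Python (for example from a notebook cell), the two steps above could be sketched roughly as follows. The helper names `ensure_data_dir` and `download_all` are hypothetical and not part of the repository; the sketch only assumes that download_data.sh sits in the repository root, as the command above implies.

```python
import subprocess
from pathlib import Path


def ensure_data_dir(repo_root: str = ".") -> Path:
    """Create <repo_root>/data if it is missing and return its path."""
    data_dir = Path(repo_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def download_all(repo_root: str = ".") -> Path:
    """Make sure the data folder exists, then run the download script."""
    data_dir = ensure_data_dir(repo_root)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(["bash", "download_data.sh"], cwd=repo_root, check=True)
    return data_dir
```

Running `download_all(".")` from the repository root should have the same effect as creating the folder by hand and running the shell command yourself.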

Extra Datasets

There are a few other datasets that we will use in the more advanced architectures, but they cannot be downloaded automatically in a reliable way. Just download them from the link and save them in the /data folder! These datasets may also be too large to train on from Google Drive, so keep that in mind!

Foundations

Computer Vision

Natural Language Processing

Speech Processing

  • Intro to Audio Processing in PyTorch
  • Connectionist Temporal Classification Loss
  • Intro to Automatic Speech Recognition
  • ASR through Self-Supervised Learning: Wav2Vec2
  • RNN Transducer as an Alternative to CTC

Generative AI

MultiModal AI

  • Building Vision/Language Representations: CLIP
  • Automatic Image Captioning
  • Visual Question Answering

Dive into Attention

  • Barebones Attention Mechanism
  • Sparse Windowed Attention
  • Linear Attention

Sequence to Sequence Modeling

  • Seq2Seq for Language Translation
  • CNN/RNN for Image Captioning
  • Attention is All You Need for Language Translation

Reinforcement Learning

  • Q-Learning
  • Deep-Q Learning

About

This repository contains exhaustive, hands-on coverage of PyTorch alongside powerful tools to accelerate model tuning and training.
