
Natural-Language-Processing

Welcome to my Natural Language Processing (NLP) diary ^_^.

What are transformers?

  • Transformers are a neural network architecture that allows parallelization across the sequence: the network can process all of the tokens in a sequence at the same time rather than one after another. This is a huge advantage over RNNs, which must process tokens sequentially (see the attention sketch after this list).
  • The architecture was introduced in the 2017 paper Attention Is All You Need by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, published in the Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017).
  • Below is a diagram of the Transformer architecture: [Transformer architecture diagram]
  • Sebastian Raschka sums it up well here
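
To make the parallelism concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the operation at the heart of the architecture. The shapes, weight matrices, and function name are illustrative assumptions, not code from this repo:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention over a whole sequence.

    X has shape (seq_len, d_model) and holds embeddings for *all* tokens;
    every matrix product below touches every position at once, which is
    the parallelism RNNs lack.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v             # project all tokens in one shot
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # (seq_len, seq_len) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # each output mixes all value vectors

# Toy usage: 4 tokens, model width 8 (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)       # (4, 8)
```

The key point is that `Q @ K.T` scores every pair of positions in a single matrix product, so no step has to wait on the previous token the way an RNN's recurrence does.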

Fave paper so far:


  • Are Emergent Abilities of Large Language Models a Mirage? (Schaeffer et al., 2023). This paper presents a compelling case that purported emergent abilities in LLMs are highly dependent on the metrics employed, challenging the community to reassess its foundational understanding of how LLMs evolve with scale.
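
As a toy illustration of the metric argument, the sketch below (all numbers invented) evaluates the same smoothly improving model two ways. Per-token accuracy climbs gradually with scale, but an all-or-nothing exact-match metric over a 20-token answer sits near zero and then shoots up, producing an apparent jump:

```python
import numpy as np

# All numbers below are invented for illustration.
scales = np.logspace(0, 4, 9)              # pretend model sizes, 1 to 10,000
per_token_acc = 1 - 0.5 * scales ** -0.3   # smooth power-law improvement
L = 20                                     # answer length in tokens

exact_match = per_token_acc ** L           # hard metric: every token must be right
for s, p, em in zip(scales, per_token_acc, exact_match):
    print(f"scale={s:>8.0f}  per-token={p:.3f}  exact-match={em:.4f}")
# Per-token accuracy climbs gradually; exact-match sits near zero, then
# takes off: an apparent "emergence" created by the metric, not the model.
```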

First favourites in research

  1. Exploiting Novel GPT-4 APIs

  2. Orca: Progressive Learning from Complex Explanation Traces of GPT-4


Research focus

  1. Alignment
     1. AI Safety (particularly interested in red-teaming)
     2. "Hallucination" problem
  2. Interpretability