pvalle6/Tokenizer_and_Bigram

This repository contains a simple implementation of the Byte Pair Encoding (BPE) algorithm, used to tokenize text, and a bigram word model.
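For readers unfamiliar with the two techniques, here is a minimal sketch of how they typically work. It is not the code in this repository: the function names (`train_bpe`, `merge_pair`, `tokenize`, `train_bigram`) are hypothetical, and it assumes whitespace-separated words, character-level starting symbols, and unsmoothed maximum-likelihood bigram estimates.

```python
from collections import Counter, defaultdict


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merge rules, starting from single characters."""
    # Each distinct word is stored as a tuple of symbols with its frequency.
    words = Counter(tuple(word) for word in text.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the new merge rule to every word.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, best)] += freq
        words = new_words
    return merges


def tokenize(word, merges):
    """Split `word` into subword tokens by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def train_bigram(tokens):
    """Estimate P(next | previous) from raw bigram counts (no smoothing)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {tok: c / sum(nxts.values()) for tok, c in nxts.items()}
        for prev, nxts in counts.items()
    }


merges = train_bpe("low low low lower lowest", num_merges=2)
print(tokenize("lower", merges))  # ['low', 'e', 'r']
print(train_bigram("the cat sat on the mat".split())["the"])
```

The BPE step repeatedly merges the most frequent adjacent pair of symbols, so common character sequences become single tokens. The bigram model simply normalizes the counts of which token follows which.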

These were created as part of my research work on language models and are my own original implementations.

https://www.linkedin.com/in/peter-v-334609211/

About

This is my simple and readable implementation of the Byte Pair Encoding Algorithm and a Bigram Model.
