Hi there 👋

I'm Stéphan Tulkens! I'm a computational linguistics/AI person. I currently work as a machine learning engineer/NLP scientist at Metamaze, where I use transformers and generative AI models to automate document processing.

I got my PhD at CLiPS at the University of Antwerp under the watchful eyes of Walter Daelemans (Computational Linguistics) and Dominiek Sandra (Psycholinguistics). The topic of my PhD was how people process orthography during reading. You can find a copy here. Before that, I studied computational linguistics (MA), philosophy (BA), and software engineering (BA).

My goal is always to make things as fast and small as possible. I like it when simple models work well, and I love it when simple models get close in accuracy to big models. I do not believe absolute accuracy is a metric to be chased, and I think we should always be mindful of what a model computes or learns from the data.

I'm currently working on 🏃‍♂️:

  • reach: a library for loading and working with word embeddings.
  • piecelearn: a library that trains a subword tokenizer and embeddings on the same corpus, giving you open vocabulary embeddings.
  • unitoken: a library for easy pre-tokenization.
  • hashing_split: a library for hash-based data splits (stable splits!); see the sketch after this list for the general idea.
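
The idea behind hash-based splitting is that each example is assigned to a split based on a hash of a stable identifier, so the assignment never changes across runs or when the dataset grows. The snippet below is a minimal sketch of that idea in plain Python; it does not use hashing_split's actual API, and the function name and parameters are made up for illustration.

```python
import hashlib

def assign_split(example_id: str, test_fraction: float = 0.2) -> str:
    """Assign an example to 'train' or 'test' from a hash of its id.

    Because the assignment depends only on the id, the same example
    always lands in the same split, even as the dataset grows.
    """
    # Map the hash digest to a float in [0, 1] and threshold it.
    digest = hashlib.sha256(example_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return "test" if bucket < test_fraction else "train"

# The same id gets the same answer, run after run.
print(assign_split("document-00042"))  # e.g. 'train'
print(assign_split("document-00042"))  # identical to the line above
```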

Other stuff I made (most of it from my PhD) 🍕:

  • wordkit: a library for working with orthography
  • old20: calculate the orthographic Levenshtein distance 20 (OLD20) metric; see the sketch after this list.
  • metameric: fast interactive activation networks in numpy.
  • humumls: load the UMLS database into a MongoDB instance. Fast!
  • dutchembeddings: word embeddings for Dutch (back when this was a cool thing to do).
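
For context, OLD20 is the mean Levenshtein distance from a word to its 20 closest neighbours in a lexicon. The snippet below is a minimal sketch of that computation, not old20's actual API; the function names and the toy lexicon are just for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def old20(word: str, lexicon: list[str], n: int = 20) -> float:
    """Mean edit distance from `word` to its n closest words in `lexicon`."""
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    return sum(distances[:n]) / n

# Toy example with n=3; a real lexicon has tens of thousands of words.
print(old20("cat", ["bat", "hat", "cart", "can", "dog"], n=3))
```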

My research interests 🤖:

  • Tokenizers, specifically subword tokenizers.
  • Embeddings, specifically static embeddings (so old-fashioned! 💀), and how to combine these in meaningful ways.
  • String similarity, and how to compute it without using dynamic programming; see the sketch below for one such approach.
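
One way to approximate string similarity without an edit-distance table is to compare sets of character n-grams. Here is a toy illustration of that general idea (n-gram set overlap); it is just a sketch, not a description of any particular library of mine.

```python
def char_ngrams(word: str, n: int = 3) -> set[str]:
    """Character n-grams with boundary markers, e.g. '#ca', 'cat', 'at#'."""
    padded = f"#{word}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-grams: no alignment, no DP table."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

# Similar spellings share many n-grams; unrelated words share few.
print(ngram_similarity("orthography", "orthographic"))
print(ngram_similarity("orthography", "philosophy"))
```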

Contact:
