
yannickfrommherz/Sprachmodelle-und-Word-Embeddings


Introducing language models and word embeddings 🧑‍💻

This repo contains a Jupyter notebook that introduces language models and word embeddings by training a word2vec model on datasets of 100K and 1M sentences from German news articles.

Prerequisites

Python and JupyterLab installed on your machine.

Instructions

  1. Start JupyterLab from your terminal.
  2. Clone this repo.
  3. Download this folder from Wortschatz Leipzig, unpack it, and save the file "deu_news_2022_1M-sentences.txt" in the "data" folder. The file is not included in this repo because it exceeds 100 MB.
  4. Navigate to this repo using the file manager inside JupyterLab.
  5. Open "Notebook.ipynb" and enjoy!
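
Before opening the notebook, it can help to confirm that the file from step 3 is in place and readable. The following is a minimal sketch, not code taken from the notebook: the helper `read_sentences` is hypothetical, and it assumes that each line of the file has the form "ID, tab, sentence". If the file is laid out differently, the line is used as-is.

```python
from pathlib import Path
from typing import Iterator, List, Optional

# Location from step 3, relative to the root of the cloned repo.
DATA_FILE = Path("data") / "deu_news_2022_1M-sentences.txt"


def read_sentences(path: Path, limit: Optional[int] = None) -> Iterator[List[str]]:
    """Yield each sentence as a list of lowercased, whitespace-separated tokens.

    Assumption (not confirmed by this README): each line is "<id><TAB><sentence>".
    Lines without a tab are treated as a bare sentence.
    """
    with path.open(encoding="utf-8") as handle:
        for count, line in enumerate(handle):
            if limit is not None and count >= limit:
                break
            # Keep only the text after the last tab; if there is no tab,
            # rpartition returns the whole line as the third element.
            _, _, sentence = line.rstrip("\n").rpartition("\t")
            yield sentence.lower().split()


if __name__ == "__main__":
    if not DATA_FILE.exists():
        print(f"Missing {DATA_FILE}: download and unpack it as described in step 3.")
    else:
        # Print the first three tokenized sentences as a quick sanity check.
        for tokens in read_sentences(DATA_FILE, limit=3):
            print(tokens)
```

Run the script from the root of the cloned repo so that the relative path resolves. Lists of tokens like these are the kind of input that word2vec implementations typically expect, but the notebook may preprocess the text differently.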
