gerardo/glossika-extractor

Glossika Sentence Extractor

A series of scripts to extract sentences from Glossika PDF course files.

For now, it is tailored to a single triangulation package: English > German > Mandarin.

Dependencies

On Mac:

brew install poppler

What it does right now

In order:

  1. Extracts raw text from PDF
  2. Gets all sentences
  3. Extracts sentences by language
  4. Extracts IPA transcriptions
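
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual scripts. It assumes the raw text comes from poppler's `pdftotext` command-line tool, and that each sentence line in the extracted text starts with a tag such as `EN`, `DE`, `ZH` or `IPA`. That tag layout is hypothetical, as are the function names `extract_raw_text` and `group_by_tag`; the real PDF layout may need different patterns.

```python
import re
import subprocess

# Hypothetical line layout: "<TAG> <sentence>", e.g. "DE Mir geht es gut."
TAGGED_LINE = re.compile(r"^\s*(EN|DE|ZH|IPA)\s+(.+?)\s*$")


def extract_raw_text(pdf_path: str) -> str:
    """Step 1: dump the PDF to text with poppler's pdftotext ("-" = stdout)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def group_by_tag(raw_text: str) -> dict:
    """Steps 2-4: keep tagged lines only and bucket them by language or IPA."""
    groups = {"EN": [], "DE": [], "ZH": [], "IPA": []}
    for line in raw_text.splitlines():
        match = TAGGED_LINE.match(line)
        if match:  # untagged lines (page numbers, headers) are skipped
            groups[match.group(1)].append(match.group(2))
    return groups
```

With a course file at hand, `group_by_tag(extract_raw_text("course.pdf"))` would return one list of sentences per tag, where `course.pdf` is a placeholder path.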
