TranscribePA

An extensible platform for rapidly transcribing historical documents using machine learning and crowdsourcing.

Why?

Many institutions, such as libraries and museums, hold large collections of scanned handwritten or typed documents. Because these documents are not transcribed, they cannot be easily searched, read, or copied and pasted, which greatly limits the usefulness of the collections. TranscribePA aims to fix this with an extensible platform that combines machine learning and human review to rapidly convert images of documents into actual text.

How Does It Work?

Upload your institution's documents as collections. After you configure a few settings, a machine learning algorithm automatically goes through each file and transcribes it. Users can then edit or verify the transcriptions through the online interface.
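The workflow above can be pictured as a small document lifecycle: uploaded, machine-transcribed and awaiting review, then verified by a person. The following is a minimal sketch in JavaScript, not TranscribePA's actual data model. The status names, the field names, and the `recognize` callback that stands in for the machine-learning step are all hypothetical.

```javascript
// Hypothetical status values; TranscribePA's real data model may differ.
const Status = Object.freeze({
  UPLOADED: "uploaded",
  NEEDS_REVIEW: "needs_review",
  VERIFIED: "verified",
});

// A newly uploaded file has no transcription yet.
function createDocument(collection, filename) {
  return { collection, filename, status: Status.UPLOADED, text: null, history: [] };
}

// `recognize` is a placeholder for the machine-learning step:
// any function that maps a filename to a string of recognized text.
function machineTranscribe(doc, recognize) {
  const text = recognize(doc.filename);
  return {
    ...doc,
    status: Status.NEEDS_REVIEW,
    text,
    history: [...doc.history, { by: "machine", text }],
  };
}

// A user corrects the transcription; the document stays in review.
function editTranscription(doc, user, text) {
  if (doc.status === Status.UPLOADED) {
    throw new Error("document has not been machine-transcribed yet");
  }
  return {
    ...doc,
    status: Status.NEEDS_REVIEW,
    text,
    history: [...doc.history, { by: user, text }],
  };
}

// A user confirms that the current text is correct.
function verify(doc, user) {
  if (doc.text === null) {
    throw new Error("nothing to verify");
  }
  return { ...doc, status: Status.VERIFIED, verifiedBy: user };
}

// Example run with a stand-in recognizer.
const fakeRecognize = (filename) => `text recognized from ${filename}`;
let doc = createDocument("letters", "page-001.png");
doc = machineTranscribe(doc, fakeRecognize);
doc = editTranscription(doc, "volunteer-1", "Dear Sir, I write to you today.");
doc = verify(doc, "volunteer-2");
console.log(doc.status, doc.history.length);
```

Each step returns a new object instead of mutating the old one, so the `history` array keeps both the machine output and every human correction.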

Features

  • Built-in search: Once your users have finished transcribing documents, the documents are searchable as well.

    • Browse page to view all documents
    • Full-text search
    • Option to download data as JSON
  • Stylish by default: The website looks modern and professional out of the box, and the styles are easy to change if you want a different look.

  • Incredibly simple installation:

    git clone https://github.com/jeffreyshen19/TranscribePA
    cd TranscribePA
    npm run setup
    
  • Easily configurable: Customize TranscribePA for your own institution, using just one config file.

  • Completely open-source and free
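To illustrate how the full-text search and JSON download features could fit together, here is a minimal sketch of a search over exported transcriptions using an inverted index. It does not reflect TranscribePA's own search implementation. The export shape (`id` and `text` fields) and the function names are assumptions made for this example.

```javascript
// Assumed export shape: [{ id: "...", text: "..." }, ...].
// This is a guess for illustration, not the project's documented format.

// Lowercase the input and split it into alphanumeric terms.
function tokenize(s) {
  return s.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Map each term to the set of document ids whose text contains it.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const term of new Set(tokenize(doc.text))) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc.id);
    }
  }
  return index;
}

// Return the sorted ids of documents that contain every query term.
function search(index, query) {
  const terms = tokenize(query);
  if (terms.length === 0) return [];
  let result = null;
  for (const term of terms) {
    const ids = index.get(term) || new Set();
    result = result === null
      ? new Set(ids)
      : new Set([...result].filter((id) => ids.has(id)));
  }
  return [...result].sort();
}

// Example with made-up records.
const sampleDocs = [
  { id: "doc-1", text: "Land deed for a farm near the river" },
  { id: "doc-2", text: "A letter about land taxes" },
];
const sampleIndex = buildIndex(sampleDocs);
console.log(search(sampleIndex, "land deed"));
```

The query is treated as an AND of its terms and matching is case-insensitive. A real deployment would more likely rely on the search support of its database.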

Documentation

See docs/ for detailed instructions on configuration, installation, and more.

Contributing

Please read CONTRIBUTING.md and the Code of Conduct for instructions on how to contribute to this project.

License

This project is licensed under the MIT License; see the LICENSE file for details.
