@TextCorpusLabs

Text Corpus Labs

We are a group of researchers focused on collecting different modes of human communication through text.

Pinned

  1. wikimedia Public

    Walk through to convert WikiMedia into a text corpus

    Python 2 1

  2. oas Public

    Walk through to convert PMC OAS Dataset into a text corpus

    Python

  3. VLNGramCounter Public

    NGram counter for large datasets

    Python

  4. building-blocks Public

    Building blocks for text pre-processing

    Python

  5. Edgar Public

    Create a corpus from EDGAR data

    Jupyter Notebook

Repositories

Showing 10 of 11 repositories
  • oas Public

    Walk through to convert PMC OAS Dataset into a text corpus

    Python 0 MIT 0 0 0 Updated Mar 25, 2024
  • Edgar Public

    Create a corpus from EDGAR data

    Jupyter Notebook 0 MIT 0 0 0 Updated Mar 20, 2024
  • metadiscourse Public

    Template code for a Metadiscourse analysis

    1 MIT 0 0 0 Updated Aug 16, 2023
  • wikimedia Public

    Walk through to convert WikiMedia into a text corpus

    Python 2 MIT 1 0 0 Updated Jan 26, 2023
  • VLNGramCounter Public

    NGram counter for large datasets

    Python 0 MIT 0 0 0 Updated Jan 20, 2023
  • building-blocks Public

    Building blocks for text pre-processing

    Python 0 MIT 0 0 0 Updated Oct 5, 2022
  • NJGovNews Public

    Web scraping of the New Jersey news feeds

    Python 0 MIT 0 0 0 Updated Mar 10, 2022
  • congressional-votes Public

    Walk through to convert congressional roll call votes into a text corpus

    Python 0 MIT 0 0 0 Updated Jan 21, 2021
  • getting-started Public

    Getting started at Text Corpus Labs

    0 MIT 0 0 0 Updated Nov 19, 2020
  • covid19 Public

    Walk through to convert Kaggle's COVID-19 Open Research Dataset Challenge into a text corpus

    Python 0 MIT 0 0 0 Updated Mar 23, 2020
