artxgj/twitter-chinese-text

twitter-chinese-text

I use Twitter as a tool to improve my Chinese reading skills. I periodically download tweets from various news sources. This repo has a few simple scripts that extract the downloaded tweet archive and transform the tweets, which are stored in a .js file, into CSV and Markdown files.
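As a rough illustration of the .js-to-CSV step, here is a minimal Python sketch. It is not the repo's actual script. It assumes the archive file is a JavaScript assignment followed by a JSON array, that each entry may be wrapped in a `tweet` object, and that the fields are named `id_str`, `created_at` and `full_text`. The function names `load_tweets_js` and `tweets_to_csv` are hypothetical.

```python
import csv
import io
import json


def load_tweets_js(text):
    """Parse archive-style JS: an assignment followed by a JSON array.

    Assumption: everything before the first '[' is the assignment prefix
    (for example 'window.YTD.tweets.part0 = ').
    """
    start = text.index("[")
    payload = text[start:].rstrip().rstrip(";")  # tolerate a trailing semicolon
    return json.loads(payload)


def tweets_to_csv(records, out):
    """Write one CSV row per tweet to the file-like object `out`."""
    writer = csv.writer(out)
    writer.writerow(["id", "created_at", "text"])
    for rec in records:
        # Assumption: some archives wrap each entry as {"tweet": {...}}.
        tweet = rec.get("tweet", rec)
        writer.writerow([
            tweet.get("id_str", ""),
            tweet.get("created_at", ""),
            tweet.get("full_text", ""),
        ])


if __name__ == "__main__":
    sample = (
        'window.YTD.tweets.part0 = '
        '[{"tweet": {"id_str": "1", "created_at": "x", "full_text": "你好"}}]'
    )
    buf = io.StringIO()
    tweets_to_csv(load_tweets_js(sample), buf)
    print(buf.getvalue())
```

When writing to a real file, open it with `newline=""` and `encoding="utf-8"` so the `csv` module controls line endings and the Chinese text is preserved.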

I rely on the Pleco app, Wiktionary, Google Translate, and Apple's translation features (iOS and macOS) to look up the many words I don't understand.

The following three Markdown files contain the curated lists of vocabulary words and names that I am studying.

  1. words_curated_study_list.md

    A subset of curated words with their English definitions.

  2. words_tweets_stats.md

    The entire set of curated words.

  3. companies_tweets_stats.md

    A list of companies/brands as they are called in Chinese.

N-grams

Chinese n-grams (1 <= n <= 10) extracted from tweets.
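
The sketch below shows one way such n-grams could be counted in Python. It is an illustration under stated assumptions, not necessarily how this repo does it: n-grams are taken character by character, they never span punctuation or non-Chinese text, and "Chinese" is approximated by the basic CJK Unified Ideographs block (U+4E00 to U+9FFF). The name `chinese_ngrams` is hypothetical.

```python
import re
from collections import Counter

# Assumption: a "Chinese character" is anything in the basic CJK
# Unified Ideographs block; extension blocks are ignored.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def chinese_ngrams(text, min_n=1, max_n=10):
    """Count character n-grams (min_n <= n <= max_n) within each run of Han characters."""
    counts = Counter()
    for run in HAN_RUN.findall(text):
        for n in range(min_n, max_n + 1):
            # Slide a window of width n over the run.
            for i in range(len(run) - n + 1):
                counts[run[i:i + n]] += 1
    return counts


if __name__ == "__main__":
    counts = chinese_ngrams("我爱北京, hello 北京")
    for gram, freq in counts.most_common(5):
        print(gram, freq)
```

Splitting on runs of Han characters first keeps n-grams from joining characters that were separated by punctuation, Latin text, or whitespace in the tweet.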