This is the final project for the Computational Social Science course 2022, part of the Masters in Data Science at UCU 21-23. It explores personal communication patterns through an analysis of all dialog data from Telegram. The flow of exploration is guided by the course assignments and consists of 2 parts: first attempts (under the headline Homework 5) and improved final results (under the headline Homework 6). The dataset of personal communication was collected with the help of the repository https://github.com/SanGreel/telegram-data-collection. This project might serve as inspiration for chat data exploration, though its structure and focus are heavily influenced by personal experience.
To reproduce the flow of exploration, first download your personal Telegram data with https://github.com/SanGreel/telegram-data-collection. Then proceed to the .ipynb file, which contains the data manipulation and visualization.
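Before opening the notebook, it can help to check that the download produced usable data. The snippet below is a minimal sketch, not part of the project: it assumes the collected dialogs are stored as one CSV file per dialog, each with a header row, inside a single directory. The function name `count_messages_per_dialog` and the directory path are hypothetical, and the actual output layout of telegram-data-collection may differ.

```python
import csv
from collections import Counter
from pathlib import Path


def count_messages_per_dialog(dialogs_dir):
    """Return a Counter mapping each CSV file stem to its number of data rows.

    Assumption: one CSV file per dialog, with a header row. The real
    export layout of telegram-data-collection may be different.
    """
    counts = Counter()
    for csv_path in sorted(Path(dialogs_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            # DictReader consumes the header row, so only messages are counted.
            reader = csv.DictReader(handle)
            counts[csv_path.stem] = sum(1 for _ in reader)
    return counts
```

Calling `count_messages_per_dialog("data/dialogs")` (a placeholder path) and inspecting `.most_common(5)` on the result would list the largest dialogs, which is a quick sanity check before running the heavier analysis.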
umap-learn
bertopic
gensim
scikit-learn
matplotlib==3.6.2
notebook==6.5.1
pandas
fasttext
stanza
Python >=3.7
spacy
plotnine