Personality Analysis Framework

This framework estimates the Big Five (OCEAN) personality scores of users who tweet in Turkish, based on their tweets.

As of October 2022, this project is inactive and no longer maintained.

Academic Background

For the academic background of this framework, please refer to "Clustering based Personality Prediction on Turkish Tweets" by Tutaysalgir, E., Karagoz, P. and Toroslu, I.H., August 2019.

Installation

Please refer to INSTALL.md.

Requirements

Python 3.8
Java 8
MySQL or MariaDB

For a list of required Python libraries, refer to requirements.txt.

Services List

This framework needs four services to run properly:

Zemberek service
Word2Vec service
Backend REST service via Flask
Frontend service via React

Zemberek Service

Zemberek provides the NLP functionalities for the framework. The communication between the framework and Zemberek is achieved through gRPC.

To run Zemberek service, you can use run_zemberek.sh.

Word2Vec service

Word2Vec service provides an API to get Word2Vec representations of words. You can use run_word2vec.sh to run the service.

Backend REST service via Flask

The backend of this project relies on a RESTful API to allow any frontend or application to make requests and get responses from it. To achieve this, we use Flask, which has a very simple but powerful interface.

This service is multithreaded: it spawns a new thread for every vector calculation request, so it can serve multiple users at a time. If you expect a very large number of simultaneous users, consider an autoscaling solution.
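The thread-per-request pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual code: `compute_vector`, `handle_request`, `submit` and the `results` dictionary are hypothetical names, and the real vector calculation is replaced by a trivial stand-in.

```python
import threading

results = {}                      # request id -> computed vector
results_lock = threading.Lock()   # guards `results` across worker threads


def compute_vector(username):
    """Hypothetical stand-in for the real (slow) vector calculation."""
    return [float(len(username))]


def handle_request(request_id, username):
    """Worker body: compute the vector and store it under the request id."""
    vector = compute_vector(username)
    with results_lock:
        results[request_id] = vector


def submit(request_id, username):
    """Start one thread per request and return it so callers can join it."""
    worker = threading.Thread(target=handle_request, args=(request_id, username))
    worker.start()
    return worker


if __name__ == "__main__":
    workers = [submit(i, name) for i, name in enumerate(["alice", "bob"])]
    for worker in workers:
        worker.join()
    print(results)
```

Because every request gets its own thread, the thread count grows with the number of concurrent requests, which is why heavy traffic may call for scaling out rather than relying on one process.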

Frontend service via React

The frontend is purely cosmetic and remains under-developed. Any frontend that makes the proper REST requests should work just as well.
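As a rough illustration of what "making the proper REST requests" could look like from a non-React client, here is a minimal sketch using only the standard library. The base URL, the `/predict` route and the JSON payload shape are all assumptions made for this example; check router.py for the routes the backend actually exposes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: Flask's default development port


def build_request(username, base_url=BASE_URL):
    """Build a JSON POST request for a hypothetical /predict endpoint."""
    payload = json.dumps({"username": username}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",  # hypothetical route, not taken from the project
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_scores(username):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(username), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the request easy to inspect without a running backend.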

Command Line Usage

Even though this framework is designed as a web tool, it can also be used from a terminal. To do so, follow these steps after installation:

Run Zemberek and Word2Vec services.
Fill in auth_pair in main.py with your credentials.
Run python main.py username if you want to give username as an argument, or python main.py to enter username when asked.
You will receive the predicted OCEAN score in the terminal.
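The "argument or prompt" behaviour in the steps above can be sketched with argparse. This is an assumption about how such handling might look, not a copy of main.py; `parse_args` and `resolve_username` are hypothetical helper names.

```python
import argparse


def parse_args(argv=None):
    """Parse an optional positional username from the command line."""
    parser = argparse.ArgumentParser(description="Predict OCEAN scores for a Twitter user.")
    parser.add_argument("username", nargs="?", help="Twitter username; asked for if omitted")
    return parser.parse_args(argv)


def resolve_username(args, ask=input):
    """Return the username from the arguments, or ask for it interactively."""
    return args.username or ask("Username: ").strip()


if __name__ == "__main__":
    print(resolve_username(parse_args()))
```

Using `nargs="?"` makes the positional argument optional, so both `python main.py username` and a bare `python main.py` are accepted.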

Reading from a CSV file

It is possible to download a user's tweets to a CSV file and use that file for the vector construction. To do so, follow these steps:

Fill in auth_pair in download-tweets.py with your credentials.
Download tweets of a user via python download-tweets.py. This will save the tweets in the data/tweets folder.
Run the vector constructor via python main.py username --file. This will read the tweets from the CSV file instead of downloading them again.

This method was implemented to avoid overloading the Twitter API with repeated downloads.
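The idea of reading cached tweets instead of downloading them again can be sketched as follows. The file name pattern (`<username>.csv`), the `text` column and the `download` callable are assumptions made for illustration; the files written by download-tweets.py may be laid out differently.

```python
import csv
from pathlib import Path

TWEET_DIR = Path("data/tweets")  # folder named in the steps above


def read_tweets(csv_path, column="text"):
    """Return the values of one column from a CSV file of tweets."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[column] for row in csv.DictReader(handle)]


def load_tweets(username, use_file, download, tweet_dir=TWEET_DIR):
    """Read cached tweets when use_file is set; otherwise call download."""
    if use_file:
        # Assumed naming scheme: one CSV file per username.
        return read_tweets(Path(tweet_dir) / f"{username}.csv")
    return download(username)
```

With this shape, the API is contacted only when no cached file is requested, so repeated runs on the same user cost no extra requests.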

Made in Ankara with 💙

