Computational Intelligence Lab 2017

Project: Sentiment Analysis on Tweets

This repository contains the final submission for the Computational Intelligence Lab 2017 project at ETH Zurich on sentiment analysis of tweets. The authors are listed below. The root folder contains the baseline (Paragraph Model), and the ac1d folder contains the improved version, based on recurrent neural networks with an attention model. The project is implemented with the TensorFlow library.

The final submission was trained on 2.5 million tweets. The data was provided by ETH Computational Intelligence Lab staff and cannot be disclosed.

Authors:

Jovan Nikolic (jovan.nikolic@gess.ethz.com)
Jovan Andonov (andonovj@student.ethz.ch)
Frederic Lafrance (flafranc@student.ethz.ch)

Project Report:

Can be found here.

Prerequisites:

Python 3.5.2
Tensorflow 1.0.0
gensim 1.0.1
numpy 1.12.1
tqdm 4.14.0
nltk 3.2.2

This script expects:

./data folder containing:
- train_neg_full.txt
- train_pos_full.txt
- train_neg.txt
- train_pos.txt
- test_data.txt
- gnews-vecs-neg300.bin

The first five files are the data sets provided by the course staff. The pretrained Google News embeddings can be downloaded here: https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit

./logs folder containing:
- exp-1-training-2017-06-19_04-36-12
  - exp-1-2017-06-19_04-36-12-ep-4.ckpt-78000.meta
  - exp-1-2017-06-19_04-36-12-ep-4.ckpt-78000.index
  - exp-1-2017-06-19_04-36-12-ep-4.ckpt-78000.data-00000-of-00001
- exp-1-training-2017-06-27_13-41-06
  - exp-1-2017-06-27_13-41-06-ep-4.ckpt-100500.meta
  - exp-1-2017-06-27_13-41-06-ep-4.ckpt-100500.index
  - exp-1-2017-06-27_13-41-06-ep-4.ckpt-100500.data-00000-of-00001

./pickled_vars folder containing:
- index2word.p
- vocab.p
- word2index.p

./word2vec folder containing:
- word_embeddings_full_200.word2vec
- word_embeddings_full_200.word2vec.syn1neg.npy
- word_embeddings_full_200.word2vec.wv.syn0.npy

The logs, pickled_vars and word2vec folders can be found here: https://drive.google.com/open?id=0B2Cv2-ukPoJrTEt6SFhVQXFmSGM
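
Before starting a run, it can help to confirm that the files listed above are in place. The following is a minimal sketch, not part of the repository: the function name `missing_files` and the `EXPECTED` table are hypothetical, and the checkpoint files under ./logs are left out for brevity.

```python
import os

# Expected layout, copied from the lists above (relative to the repository root).
# The ./logs checkpoints are omitted here to keep the sketch short.
EXPECTED = {
    "data": [
        "train_neg_full.txt",
        "train_pos_full.txt",
        "train_neg.txt",
        "train_pos.txt",
        "test_data.txt",
        "gnews-vecs-neg300.bin",
    ],
    "pickled_vars": ["index2word.p", "vocab.p", "word2index.p"],
    "word2vec": [
        "word_embeddings_full_200.word2vec",
        "word_embeddings_full_200.word2vec.syn1neg.npy",
        "word_embeddings_full_200.word2vec.wv.syn0.npy",
    ],
}


def missing_files(root="."):
    """Return the expected paths that do not exist under root."""
    missing = []
    for folder, names in EXPECTED.items():
        for name in names:
            path = os.path.join(root, folder, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_files():
        print("missing:", path)
```

Running the file from the repository root prints one line per missing file and nothing if everything is present.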

The data must contain one tweet per line, with separate files for positive and negative tweets.
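
As an illustration of this format, the sketch below reads a positive and a negative file and attaches a label to each tweet. The function name `load_tweets` and the label encoding (1 for positive, 0 for negative) are assumptions made for the example; the repository's own loader may differ.

```python
def load_tweets(pos_path, neg_path):
    """Read one tweet per line from each file and return (tweets, labels).

    Labels are an assumption for this sketch: 1 = positive, 0 = negative.
    """
    tweets, labels = [], []
    for path, label in ((pos_path, 1), (neg_path, 0)):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if line:  # skip empty lines
                    tweets.append(line)
                    labels.append(label)
    return tweets, labels
```

For example, `load_tweets("data/train_pos.txt", "data/train_neg.txt")` would return all tweets from the two small training files in one list, positives first.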

Parameters for RNN model:

All important parameters are defined in ac1d/configuration.py.

Running:

Running the advanced model:

To train the seq2seq neural network, use the following command:

python3 ac1d/main.py -n <num_cores>

where <num_cores> is the number of cores TensorFlow uses (the level of parallelism). Running this script also produces predictions on the test set.
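
To make the role of the -n flag concrete, here is a minimal sketch of how such a flag could be parsed and turned into thread settings. It is not taken from ac1d/main.py: the long option `--num_cores`, the default of 1, and the helper `thread_settings` are hypothetical. The two dictionary keys are the thread-pool fields of `tf.ConfigProto` in TensorFlow 1.x.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only the -n flag from this README is modeled."""
    parser = argparse.ArgumentParser(description="Illustrative argument parsing.")
    parser.add_argument(
        "-n", "--num_cores", type=int, default=1,
        help="number of cores TensorFlow may use",
    )
    return parser.parse_args(argv)


def thread_settings(num_cores):
    """Map the core count to TensorFlow 1.x thread-pool options.

    Assumption: the same value is used for both pools.
    """
    return {
        "intra_op_parallelism_threads": num_cores,
        "inter_op_parallelism_threads": num_cores,
    }


if __name__ == "__main__":
    args = parse_args()
    print(thread_settings(args.num_cores))
```

With TensorFlow 1.x installed, the dictionary could be passed on as `tf.Session(config=tf.ConfigProto(**thread_settings(args.num_cores)))`.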

The output of this run will be:

pickled_vars folder with the following content:
- vocab.p
- word2index.p
- index2word.p

logs folder with saved graphs of the trained network

submissions folder with submission.csv file
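
The three pickled files suggest a vocabulary plus two lookup tables. The sketch below shows one plausible way such files could be produced; the whitespace tokenization, the sorted index order, and the function names `build_vocab` and `save_pickles` are assumptions for illustration, not the repository's actual preprocessing.

```python
import os
import pickle
from collections import Counter


def build_vocab(tweets, min_count=1):
    """Build a vocabulary and both lookup tables from whitespace-split tweets."""
    counts = Counter(token for tweet in tweets for token in tweet.split())
    vocab = sorted(word for word, count in counts.items() if count >= min_count)
    word2index = {word: index for index, word in enumerate(vocab)}
    index2word = {index: word for word, index in word2index.items()}
    return vocab, word2index, index2word


def save_pickles(out_dir, vocab, word2index, index2word):
    """Write the three objects under the file names listed above."""
    os.makedirs(out_dir, exist_ok=True)
    files = (
        ("vocab.p", vocab),
        ("word2index.p", word2index),
        ("index2word.p", index2word),
    )
    for name, obj in files:
        with open(os.path.join(out_dir, name), "wb") as f:
            pickle.dump(obj, f)
```

Each saved file can be read back with `pickle.load` on a file opened in binary mode.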

Running the baseline model:

To run the baseline model, use the following command:

python3 baseline.py
