kumarnalinaksh21/kaggle

TCS Enterprise Intelligent Automation – ARTIFICIAL INTELLIGENCE Competition

Repository for the TCS Enterprise Intelligent Automation – ARTIFICIAL INTELLIGENCE Competition.

The datasets are taken from the Kaggle Quora competition.

Dependencies

Overview

Feature Extraction on Sentences

We extract doc2vec features for each sentence using the gensim library and a doc2vec (DBOW) model pretrained on the English Wikipedia dataset. The pretrained model is available here.

The code for feature extraction is here: feat.py.

To use it:

python feat.py
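As a rough illustration of what this step involves, here is a minimal sketch that infers one doc2vec vector per question and concatenates the two. It is not the contents of feat.py: the function names (`tokenize`, `load_model`, `pair_features`), the lowercase whitespace tokenization, and the concatenation of the two vectors are all assumptions made for the example. Only `Doc2Vec.load` and `infer_vector` are gensim calls.

```python
def tokenize(sentence):
    """Lowercase and split on whitespace (an assumed, simplified tokenizer)."""
    return sentence.lower().split()


def load_model(path):
    """Load a saved gensim Doc2Vec model from `path` (requires gensim)."""
    # Imported here so the rest of the sketch works without gensim installed.
    from gensim.models.doc2vec import Doc2Vec
    return Doc2Vec.load(path)


def pair_features(model, question1, question2):
    """Infer one vector per question and concatenate them into one list.

    `model` is any object with an `infer_vector(list_of_tokens)` method,
    such as a loaded gensim Doc2Vec model.
    """
    v1 = model.infer_vector(tokenize(question1))
    v2 = model.infer_vector(tokenize(question2))
    return list(v1) + list(v2)
```

With a real model, usage would look like `pair_features(load_model(path), q1, q2)`, where `path` points at the downloaded pretrained model file.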

Network Training using doc2vec features with Softmax CrossEntropy Loss

The code is here: train.py.

To log the per-iteration loss, use:

bash train_with_logging.sh

Training logs are available in log.info.
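For reference, the softmax cross-entropy loss for one example can be sketched in plain Python as below. This is an illustration of the loss itself, not the code in train.py, which presumably relies on a deep learning framework's built-in implementation. The function names are chosen for this example.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class `label`.

    Computed as log(sum_j exp(z_j)) - z_label using the log-sum-exp trick,
    so large logits do not overflow.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]
```

For a two-class duplicate/not-duplicate problem, `logits` would hold two scores and `label` would be 0 or 1. Equal scores give a loss of log 2.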

Training Loss

Testing the dataset using doc2vec features with trained network

The code is here: test.py.

To use it:

python test.py

The output (id_, prob_) is saved to the text file test_probs.txt.

Low Level Details

  • We remove all punctuation marks from the sentences in a preprocessing step
  • We use an input-flipping data augmentation scheme so that the network is robust to the order in which the two sentences are presented to it
  • At test time, we compute the probabilities for both input orders and average them
  • The network definition is available in helpers/network.py
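The preprocessing, augmentation, and test-time averaging steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the function names are hypothetical, punctuation is taken to mean ASCII punctuation (`string.punctuation`), and `predict` stands in for whatever callable returns the trained network's duplicate probability for an ordered pair of feature vectors.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(sentence):
    """Remove ASCII punctuation marks from a sentence."""
    return sentence.translate(_PUNCT_TABLE)


def augment_with_flips(pairs):
    """Return each (feat1, feat2, label) example in both input orders."""
    augmented = []
    for feat1, feat2, label in pairs:
        augmented.append((feat1, feat2, label))
        augmented.append((feat2, feat1, label))  # flipped copy, same label
    return augmented


def predict_flip_averaged(predict, feat1, feat2):
    """Average the probabilities predicted for the two input orders.

    `predict(a, b)` is a hypothetical callable returning a probability.
    """
    return 0.5 * (predict(feat1, feat2) + predict(feat2, feat1))
```

Training on both orders doubles the number of examples, and averaging at test time makes the final probability the same regardless of which question is listed first.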
