This repository is a collection of models and scripts used for the 53rd-place solution to the toxic comments challenge hosted on Kaggle.
TODOs:
- add models
- test models one by one
- add preprocessing scripts
RNN with GRU cells and a weighted-average attention layer
RNN with GRU cells topped with Capsules
https://arxiv.org/abs/1710.09829
Hierarchical attention network
https://www.cs.cmu.edu/~hovy/papers/16HLT-hierarchical-attention-networks.pdf
Hierarchical attention network with LSTM cells
Naive Bayes Logistic Regression
Gradient boosting with the xgboost package
https://xgboost.readthedocs.io/en/latest/
Gradient boosting with the LightGBM package
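As an illustration of the Naive Bayes Logistic Regression entry above, here is a minimal sketch of the log-count ratio that this family of models typically uses to scale features before fitting a logistic regression. It is not taken from this repository: the function name `log_count_ratio`, the smoothing value `alpha`, the use of raw token counts, and the toy data are all assumptions made for the example, and the logistic regression step itself is left out.

```python
import math
from collections import Counter


def log_count_ratio(docs, labels, alpha=1.0):
    """Return {token: r} with r = log((p / |p|_1) / (q / |q|_1)).

    docs   -- list of tokenized documents (lists of strings)
    labels -- list of 0/1 labels, 1 marking the positive class
    alpha  -- additive smoothing (hypothetical default, not from the repo)
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label == 1 else neg).update(doc)
    # Smoothed token counts for the positive (p) and negative (q) class.
    p = {tok: alpha + pos[tok] for tok in vocab}
    q = {tok: alpha + neg[tok] for tok in vocab}
    p_norm, q_norm = sum(p.values()), sum(q.values())
    return {tok: math.log((p[tok] / p_norm) / (q[tok] / q_norm)) for tok in vocab}


# Toy data, made up for illustration: 1 = toxic, 0 = clean.
docs = [["you", "idiot"], ["thanks", "you"], ["idiot", "idiot"], ["thanks", "friend"]]
labels = [1, 0, 1, 0]
ratios = log_count_ratio(docs, labels)
```

Tokens that appear mostly in toxic comments get a positive ratio, tokens that appear mostly in clean comments get a negative one, and tokens spread evenly across both classes land near zero. Multiplying each document's feature vector by these ratios gives the input for the logistic regression.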
- assets
  - train.csv
  - test.csv
  - sample_submission
  - glove.twitter.27B.200d.txt (GloVe embeddings)
  - crawl-300d-2M.vec (fastText embeddings)
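Both embedding files listed above are plain text with one word per line followed by its vector components. A minimal sketch of reading such a file into a dictionary could look like the following. The function name `load_embeddings` and the in-memory sample are hypothetical and not part of this repository; the sketch also assumes that a two-field line is the `count dim` header that fastText `.vec` files start with (GloVe files have no header).

```python
import io


def load_embeddings(lines):
    """Map each word to its vector, given an iterable of text lines."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        # Assumption: a two-field line is the fastText "count dim" header.
        if len(parts) == 2:
            continue
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Tiny made-up sample in the fastText .vec layout (header, then vectors).
sample = "2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n"
vectors = load_embeddings(io.StringIO(sample))
```

For the real files, the same function can be given an open file handle, for example `open("assets/crawl-300d-2M.vec", encoding="utf-8")`, assuming the files sit in the assets directory as listed above.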