Simulation Study: Introducing FasttextSum #1420
rohitgarud
started this conversation in
Show and tell
Hello everyone,
Here I am introducing ASReview-FasttextSum, a feature extraction technique (new to ASReview, not novel in itself) that generates feature vectors (document embeddings) from a weighted sum of FastText word embeddings. Five weighting (pooling) options are available: mean, TF-IDF (Term Frequency-Inverse Document Frequency), TF (Term Frequency), log base 2 of TF, and Smooth Inverse Frequency (SIF). Stopwords can optionally be removed.
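To make the pooling options concrete, here is a minimal sketch of weighted-sum pooling in plain Python. It is not the ASReview-FasttextSum code: the post does not give the exact formulas, so the weights below are common textbook definitions and should be read as assumptions (in particular `1 + log2(tf)` for the log variant, a smoothed IDF, and SIF without the principal-component removal step). `TOY_VECTORS`, `word_weights` and `document_vector` are hypothetical names, and the 3-dimensional vectors stand in for embeddings that would normally come from a trained FastText model.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional "embeddings"; a real run would look these up
# in a trained FastText model instead.
TOY_VECTORS = {
    "deep": [1.0, 0.0, 0.0],
    "learning": [0.0, 1.0, 0.0],
    "review": [0.0, 0.0, 1.0],
}


def word_weights(tokens, method, corpus, a=1e-3):
    """Return {word: weight} for one document (assumed textbook definitions)."""
    tf = Counter(tokens)
    if method == "mean":
        # Plain average over all token occurrences.
        return {w: c / len(tokens) for w, c in tf.items()}
    if method == "tf":
        # Raw term frequency.
        return {w: float(c) for w, c in tf.items()}
    if method == "log_tf":
        # Dampened term frequency; the "1 +" keeps words that occur once.
        return {w: 1.0 + math.log2(c) for w, c in tf.items()}
    if method == "tfidf":
        # Document frequency: number of documents containing each word.
        df = Counter(w for doc in corpus for w in set(doc))
        n_docs = len(corpus)
        return {
            w: c * (math.log((1 + n_docs) / (1 + df[w])) + 1.0)
            for w, c in tf.items()
        }
    if method == "sif":
        # p(w): relative frequency of the word in the whole corpus.
        counts = Counter(w for doc in corpus for w in doc)
        total = sum(counts.values())
        return {
            w: (c / len(tokens)) * a / (a + counts[w] / total)
            for w, c in tf.items()
        }
    raise ValueError(f"unknown pooling method: {method}")


def document_vector(tokens, method, corpus, vectors=TOY_VECTORS):
    """Weighted sum of word vectors for one tokenized document."""
    dim = len(next(iter(vectors.values())))
    # The toy lookup table has no subword fallback, so unknown words are skipped.
    known = [t for t in tokens if t in vectors]
    doc_vec = [0.0] * dim
    for word, weight in word_weights(known, method, corpus).items():
        for i, value in enumerate(vectors[word]):
            doc_vec[i] += weight * value
    return doc_vec


corpus = [["deep", "learning", "deep"], ["review", "learning"]]
print(document_vector(corpus[0], "tf", corpus))  # [2.0, 1.0, 0.0]
```

Under these assumed definitions, the options differ only in how much each word contributes: TF-IDF and SIF both down-weight words that are frequent across the corpus, while mean, TF and log TF look only at counts within the document.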
SIF pooling performs well on most of the datasets. Performance in terms of ATD (Average Time to Discovery) can be improved by increasing the number of epochs; the results presented here are for only 33 epochs of FastText model training. Stopwords were kept during this simulation study.