This repository has been archived by the owner on Nov 8, 2020. It is now read-only.

A Question About data_processor.populate_word_blacklist #72

Open
Doragd opened this issue Nov 2, 2019 · 0 comments

Doragd commented Nov 2, 2019

def populate_word_blacklist(word_index):
    blacklisted_words = set()
    blacklisted_words |= set(global_config.predefined_word_index.values())
    if global_config.filter_sentiment_words:
        blacklisted_words |= lexicon_helper.get_sentiment_words()
    if global_config.filter_stopwords:
        blacklisted_words |= lexicon_helper.get_stopwords()
  1. global_config.predefined_word_index.values() returns word indices, not the words themselves.
  2. At this point, global_config.predefined_word_index is actually equal to word_index; it is no longer just {'<unk>': 0, '<sos>': 1, '<eos>': 2}.
  3. Therefore, I think blacklisted_words ends up containing entries that do not belong there, which does not match the purpose of a blacklist.
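
To illustrate point 1, here is a minimal sketch rather than the repository's actual code. The dictionary below is a hypothetical stand-in for global_config.predefined_word_index, assumed to hold only the three special tokens, and is_blacklisted is a helper invented for this example. It shows that a set built from .values() never matches a word, while a set built from .keys() does.

```python
# Hypothetical stand-in for global_config.predefined_word_index,
# assuming it still holds only the three special tokens.
predefined_word_index = {"<unk>": 0, "<sos>": 1, "<eos>": 2}

# What the quoted code collects: integer indices.
index_blacklist = set(predefined_word_index.values())

# What a word blacklist would need: the tokens themselves.
token_blacklist = set(predefined_word_index.keys())


def is_blacklisted(word, blacklist):
    """Illustrative helper: membership test against a blacklist set."""
    return word in blacklist


print(is_blacklisted("<unk>", index_blacklist))  # False: the set holds 0, 1, 2
print(is_blacklisted("<unk>", token_blacklist))  # True
```

Note that if point 2 holds and the dictionary already contains the whole vocabulary, switching to .keys() alone would blacklist every word, so the special tokens would probably need to be captured before the vocabulary is merged in.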
@Doragd Doragd changed the title An Question About data_processor.populate_word_blacklist A Question About data_processor.populate_word_blacklist Nov 2, 2019