empath visualisation doesn't work with non binary categories #31

Open
swartchris8 opened this issue May 1, 2018 · 2 comments

@swartchris8

Clicking on nodes in the Empath visualisation does not show the relevant text. Each click raises the error below (the property number differs from click to click), and no text is rendered under the visualisation.

Browser error:

Billingpayment-Visualization.html:4484 Uncaught TypeError: Cannot read property '14' of undefined
    at searchInExtraFeatures (Billingpayment-Visualization.html:4484)
    at gatherTermContexts (Billingpayment-Visualization.html:4453)
    at SVGTextElement.<anonymous> (Billingpayment-Visualization.html:5027)
    at SVGTextElement.<anonymous> (d3.min.js:2)

Python code to generate visualisation:

import scattertext as st
from IPython.display import IFrame

convention_df = st.SampleCorpora.ConventionData2012.get_data()
# Relabel a few rows positionally in one step, avoiding chained assignment
party_col = convention_df.columns.get_loc("party")
convention_df.iloc[[3, 5], party_col] = "liberal"
convention_df.iloc[[4, 6], party_col] = "republican"

empath_corpus = st.CorpusFromParsedDocuments(convention_df.iloc[:15],
                                             category_col="party",
                                             feats_from_spacy_doc=st.FeatsFromOnlyEmpath(),
                                             parsed_col="text").build()

html = st.produce_scattertext_explorer(empath_corpus,
    category = 'democrat',
    category_name = 'democrat',
    not_category_name = "Not democrat",
    width_in_pixels=1000,
    use_non_text_features=True,
    use_full_doc=True)

file_name = 'democrat.html'
open(file_name, 'wb').write(html.encode('utf-8'))
IFrame(src=file_name, width = 1200, height=700)

Your Environment

  • Operating System: Ubuntu
  • Python Version Used: 3.6
  • Scattertext Version Used: 0.0.2.25
  • Environment Information:
  • Browser used (if an HTML error): Chrome, Chromium tested
@swartchris8
Author

It seems the issue isn't caused by the multiple categories but by the Empath visualisation itself. The following snippet with 2 categories still fails:

import scattertext as st
from IPython.display import IFrame

convention_df = st.SampleCorpora.ConventionData2012.get_data()
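# Relabel a few rows positionally in one step, avoiding chained assignment
party_col = convention_df.columns.get_loc("party")
convention_df.iloc[[3, 5], party_col] = "liberal"
convention_df.iloc[[4, 6], party_col] = "republican"
# Collapse everything that is not "democrat" into one category; a single .loc
# call is needed here because df[mask]["party"] = ... only writes to a copy
convention_df.loc[convention_df["party"] != "democrat", "party"] = "not democrat"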

empath_corpus = st.CorpusFromParsedDocuments(convention_df[:14],
                                             category_col="party",
                                             feats_from_spacy_doc=st.FeatsFromOnlyEmpath(),
                                             parsed_col="text").build()

html = st.produce_scattertext_explorer(empath_corpus,
    category = 'democrat',
    category_name = 'democrat',
    not_category_name = "Not democrat",
    width_in_pixels=1000,
    use_non_text_features=True,
    use_full_doc=True)

file_name = 'democrat.html'
open(file_name, 'wb').write(html.encode('utf-8'))
IFrame(src=file_name, width = 1200, height=700)
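
When relabelling rows like this, it can help to confirm that the party column really holds only two values before building the corpus. A minimal sketch, assuming pandas and using a made-up toy frame (`toy_df` and its rows are invented for illustration, not taken from ConventionData2012): a single `.loc` call with a boolean mask writes into the frame itself, and `value_counts()` shows the resulting categories.

```python
import pandas as pd

# Toy stand-in for the convention data; these rows are invented for illustration.
toy_df = pd.DataFrame({
    "party": ["democrat", "liberal", "republican", "democrat", "republican"],
    "text": ["a", "b", "c", "d", "e"],
})

# One .loc call with a boolean mask assigns into toy_df itself.
toy_df.loc[toy_df["party"] != "democrat", "party"] = "not democrat"

# Count the rows in each remaining category.
category_counts = toy_df["party"].value_counts().to_dict()
print(category_counts)
```

The same check (`convention_df["party"].value_counts()`) could be run on the real frame right before the `CorpusFromParsedDocuments` call.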

@JasonKessler
Owner

Thanks for the bug report.

I just made some significant improvements to the topic modeling component in Scattertext. Not only can you view documents that match an Empath category, but if you add

topic_model_term_lists=st.FeatsFromOnlyEmpath().get_top_model_term_lists()

as a parameter to produce_scattertext_explorer, it will bold the terms associated with the empath category. Please see https://github.com/JasonKessler/scattertext#visualizing-topic-models for more information.
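
Put together with the call from the report, this might look roughly like the sketch below. The wrapper function `build_empath_explorer_html` is a hypothetical name used only for illustration. It assumes a Scattertext release that includes the change described above (the thread does not say which version) and that the `empath` package is installed. One `FeatsFromOnlyEmpath` instance is reused for both the corpus and the term lists.

```python
def build_empath_explorer_html(convention_df):
    """Return Scattertext explorer HTML with Empath topic terms bolded.

    Hypothetical helper: the name and structure are illustrative only.
    """
    # Imported lazily so the function can be defined without scattertext installed.
    import scattertext as st

    feat_builder = st.FeatsFromOnlyEmpath()
    empath_corpus = st.CorpusFromParsedDocuments(
        convention_df,
        category_col="party",
        feats_from_spacy_doc=feat_builder,
        parsed_col="text",
    ).build()
    return st.produce_scattertext_explorer(
        empath_corpus,
        category="democrat",
        category_name="democrat",
        not_category_name="Not democrat",
        width_in_pixels=1000,
        use_non_text_features=True,
        use_full_doc=True,
        # Term lists per Empath category, used to bold matching terms.
        topic_model_term_lists=feat_builder.get_top_model_term_lists(),
    )
```

Calling it with `st.SampleCorpora.ConventionData2012.get_data()` and writing the returned string to an HTML file would mirror the original snippet.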
