Amend sentiment.md #1357

Open
michael-evrythngwrx opened this issue Sep 4, 2023 · 3 comments

Comments

@michael-evrythngwrx

Is your feature request related to a problem? Please describe.

Currently there isn't a reference for the absolute min and max sentiment score after processing.

Describe the solution you'd like
Could you document the scale of the sentiment score in the sentiment.md file? For example: -1 to 1.


@jesus-seijas-sp
Contributor

jesus-seijas-sp commented Sep 5, 2023

The sentiment analysis is done with lexicons, and there are three possible types of lexicon for each language: AFINN, Senticon and Pattern.

Sentiment analysis usually aims at a classification into three classes: negative, neutral and positive. If the score is negative, the sentiment is negative; if the score is 0, the sentiment is neutral; if the score is positive, the sentiment is positive.
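The three-way classification described above can be sketched as follows. This is only an illustration of the sign rule; `voteFromScore` is a hypothetical helper name, not a function exposed by NLP.js.

```javascript
// Hypothetical helper (not part of NLP.js): maps a sentiment score
// to one of the three classes by looking only at its sign.
function voteFromScore(score) {
  if (score > 0) return 'positive';
  if (score < 0) return 'negative';
  return 'neutral';
}

console.log(voteFromScore(0.313)); // prints "positive"
console.log(voteFromScore(0));     // prints "neutral"
console.log(voteFromScore(-0.2));  // prints "negative"
```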

The result of the sentiment analysis reports which type of lexicon was used in its type property.

Example response from the sentiment analysis:

{
  score: 0.313,
  numWords: 3,
  numHits: 1,
  comparative: 0.10433333333333333,
  type: 'senticon',
  language: 'en'
}
  • score will contain the total score for the sentence calculated with the lexicon
  • numWords will contain the number of words of the sentence
  • numHits will contain the number of words of the sentence that are in the lexicon dictionary
  • comparative will contain score / numWords
  • type will contain 'afinn', 'senticon' or 'pattern'
  • language: the language
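To make the relationship between these fields concrete, here is a minimal sketch of a lexicon scorer. It assumes a toy lexicon with invented values and a hypothetical function name `scoreSentence`; it is not the NLP.js implementation and it leaves out negation handling and stemming.

```javascript
// Toy lexicon with invented values, for illustration only.
// Real lexicons (AFINN, Senticon, Pattern) use their own entries and scales.
const toyLexicon = { good: 0.5, bad: -0.5 };

// Hypothetical scorer: sums the lexicon values of the words that are found.
function scoreSentence(words, lexicon) {
  let score = 0;
  let numHits = 0;
  for (const word of words) {
    if (Object.prototype.hasOwnProperty.call(lexicon, word)) {
      score += lexicon[word]; // score is a total, not an average
      numHits += 1;
    }
  }
  const numWords = words.length;
  // comparative is the total divided by the number of words in the sentence
  const comparative = numWords > 0 ? score / numWords : 0;
  return { score, numWords, numHits, comparative };
}

const result = scoreSentence(['a', 'good', 'day'], toyLexicon);
console.log(result);
```

Because score is a running total over every matching word, it grows with the number of hits, while comparative is normalized by sentence length.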

To understand AFINN better, I recommend reading the original paper by Finn Årup Nielsen:
https://arxiv.org/abs/1103.2903

To compare the accuracy of different lexicon-based sentiment analysis methods, see:
https://www.researchgate.net/publication/343473213_Evaluating_the_performance_of_the_most_important_Lexicons_used_to_Sentiment_analysis_and_opinions_Mining

NLP.js adds two main improvements: handling of negations and stemming of words.
There is also a paper analyzing improvements to sentiment analysis, not all of which are used in NLP.js:
https://www.sciencedirect.com/science/article/pii/S2090447921003105

I hope you enjoy the reading

@michael-evrythngwrx
Author

michael-evrythngwrx commented Sep 5, 2023

Here is something I am running into:
{
  sentiment: {
    score: 1.563,
    numWords: 20,
    numHits: 7,
    average: 0.07815,
    type: 'senticon',
    locale: 'en',
    vote: 'positive'
  }
}

Senticon is returning a score greater than 1.

@jesus-seijas-sp
Contributor

As I said in my previous comment: "score will contain the total score for the sentence calculated with the lexicon".
Total means the sum of the scores of the individual words.
numHits is 7, so 7 words in your sentence are in the lexicon; since numWords is 20, the other 13 words are not present in the lexicon.
1.563 is the sum of the scores of these 7 words.
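The arithmetic behind this explanation can be checked with a short sketch that uses only the numbers from the response quoted above. The variable names `numMisses` and `meanPerHit` are introduced here for illustration and are not fields returned by NLP.js.

```javascript
// Values copied from the response quoted earlier in this thread.
const sentiment = { score: 1.563, numWords: 20, numHits: 7 };

// Words of the sentence that were not found in the lexicon.
const numMisses = sentiment.numWords - sentiment.numHits;

// The reported average is the total score divided by all words (about 0.07815).
const average = sentiment.score / sentiment.numWords;

// Mean score of the words that were found (about 0.223),
// which shows that no single word needs a score above 1
// for the total to exceed 1.
const meanPerHit = sentiment.score / sentiment.numHits;

console.log({ numMisses, average, meanPerHit });
```

So a total above 1 does not by itself mean any individual word scored above 1; the normalized value is the average field, not score.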
