These words shouldn't be considered 'positive' #3

tonyjiang · 2013-07-12T07:40:45Z

0.875 ill-mannered
0.5 brutally
0.5 boneheaded
0.5 cynically
0.5 cutthroat
0.5 dishonestly
0.5 dishonestly

I wonder where you got the list?

jemminger · 2013-07-12T14:27:44Z

Taken directly from here: https://github.com/cmaclell/Basic-Tweet-Sentiment-Analyzer

jemminger · 2013-07-12T14:30:02Z

What values do you propose they be? Just make them negative?

agarie · 2013-12-18T01:52:51Z

Maybe...

-0.5 ill-mannered
-0.40 boneheaded
-0.75 dishonestly
-0.875 brutally
-0.875 cynically
-0.875 cutthroat

I tried to compare these words to others found in sentiwords.txt. However, I'm not a native speaker, so I might be wrong.

tonyjiang · 2013-12-18T02:00:06Z

@agarie - 'cynically' can't be as bad as 'brutally', nor is it worse than 'dishonestly'.

agarie · 2013-12-18T02:10:38Z

@tonyjiang Interesting. I didn't know that about 'cynically' / 'dishonestly' (in Portuguese it's actually the reverse). Do you have any suggestions?

edmondlafay · 2016-04-19T07:38:52Z

The real question is how the dictionaries were created:
These word lists are generally computed from large corpora of texts annotated by hand as positive or negative. One of the simpler approaches, which has probably been used here, is to run a program that sums the number of times a word is seen in positive and negative texts and normalizes it by the number of texts it appears in. Therefore the word scores don't reflect the actual meaning of the word, but how people use it in texts.
The problem is that these scores depend heavily on the corpus you use: if your corpus is classic literature, your dictionaries will contain more words, and those words will be used in a more literal manner than if your corpus is a set of random tweets, where people use a limited vocabulary and word meaning follows popular usage.
Therefore it is best to rebuild a dictionary that better suits your usage of the gem, and the given dictionaries should just be used as default settings.
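
As a rough illustration of the counting approach described above, here is a minimal Python sketch. It is not the code used to build the gem's word lists: the function name `build_scores`, the +1/-1 label convention, and the toy corpus are all made up for this example, and each word is counted once per text to keep things simple.

```python
from collections import defaultdict


def build_scores(labeled_texts):
    """Score each word from hand-labeled texts.

    labeled_texts: iterable of (text, label) pairs, where label is +1 for a
    positive text and -1 for a negative one (an assumed convention).
    Returns {word: score}, with scores between -1 and 1.
    """
    totals = defaultdict(int)  # sum of the labels of the texts containing the word
    counts = defaultdict(int)  # number of texts containing the word
    for text, label in labeled_texts:
        # set() so a word is counted once per text, however often it repeats
        for word in set(text.lower().split()):
            totals[word] += label
            counts[word] += 1
    # Normalize by the number of texts the word appears in
    return {word: totals[word] / counts[word] for word in counts}


# Made-up toy corpus: "brutally" appears mostly in texts labeled positive.
corpus = [
    ("what a brutally honest review", 1),
    ("brutally good game", 1),
    ("he was brutally attacked", -1),
    ("a dishonestly written report", -1),
]
scores = build_scores(corpus)
print(scores["brutally"], scores["dishonestly"])
```

With this toy corpus 'brutally' comes out positive simply because it shows up more often in texts labeled positive, which is the kind of effect that could explain the scores reported at the top of this issue.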

cromulus · 2016-12-29T20:03:27Z

Might I suggest this corpus of word sentiments?

SentiWordNet is the current best-in-class corpus of words and sentiments: http://sentiwordnet.isti.cnr.it/

It's a rather large DB, perhaps it might be useful as an alternative?

Its format is slightly different: each word is scored both as positive and negative, from 0 to 1. Some words have a score for both positive and negative. Perhaps we just subtract negative from positive?
Unclear.
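
To make the "subtract negative from positive" idea concrete, here is a hedged Python sketch. It assumes the tab-separated layout of the SentiWordNet 3.0 download (POS, ID, PosScore, NegScore, SynsetTerms, Gloss, with terms written like `word#1`); please check that against the actual file. The function name `load_net_scores` and the sample rows are invented for illustration, and averaging across a word's senses is just one possible choice, not something the gem does today.

```python
from collections import defaultdict


def load_net_scores(lines):
    """Collapse per-sense (positive, negative) scores into one score per word.

    Assumes tab-separated columns: POS, ID, PosScore, NegScore, SynsetTerms,
    Gloss. Each sense contributes PosScore - NegScore; a word's senses are
    then averaged (an assumption made for this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip rows that don't match the assumed layout
        pos, neg = float(fields[2]), float(fields[3])
        for term in fields[4].split():
            word = term.rsplit("#", 1)[0]  # drop the "#<sense number>" suffix
            sums[word] += pos - neg
            counts[word] += 1
    return {word: sums[word] / counts[word] for word in counts}


# Made-up rows in the assumed layout, not real SentiWordNet entries.
sample = [
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss",
    "a\t00000001\t0.0\t0.75\texample_bad#1\tmade-up entry",
    "a\t00000002\t0.5\t0.25\texample_mixed#1 example_bad#2\tmade-up entry",
]
net = load_net_scores(sample)
print(net)
```

The result lands in the same -1 to 1 range as the current word lists, so it could in principle be written out in the same format. Whether a plain average over senses is the right choice is an open question.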
