AntoinePassemiers/Lexicon-Based-Sentiment-Analysis

LBSA - Lexicon-based Sentiment Analysis

Fast library for sentiment analysis, opinion mining and language detection.

Installation

Install dependencies:

$ sudo pip3 install -r requirements.txt

From the folder that contains setup.py, install the library with the following command:

$ sudo python3 setup.py install

To access the NRC lexicon, download it from: http://www.saifmohammad.com/WebDocs/Lexicons/NRC-Emotion-Lexicon.zip

Extract the archive and provide the path to the Excel file the first time you use the NRC lexicon. For example:

>>> import lbsa
>>> path = 'path/to/NRC-Emotion-Lexicon-v0.92-In105Languages-Nov2017Translations.xlsx'
>>> sa_lexicon = lbsa.get_lexicon('sa', language='english', source='nrc', path=path)

Dependencies

  • numpy >= 1.13.3
  • pandas >= 0.21.0
  • xlrd

Features

Sentiment analysis

>>> import lbsa
>>> tweet = """
... The Budget Agreement today is so important for our great Military.
... It ends the dangerous sequester and gives Secretary Mattis what he needs to keep America Great.
... Republicans and Democrats must support our troops and support this Bill!
... """
>>> sa_lexicon = lbsa.get_lexicon('sa', language='english', source='nrc')
>>> sa_lexicon.process(tweet)
{'anger': 0, 'anticipation': 0, 'disgust': 0, 'fear': 2, 'joy': 0, 'sadness': 0, 
'surprise': 0, 'trust': 3}
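
For intuition, the counting behind an output like this can be sketched with a toy lexicon. This is a minimal illustration and not LBSA's implementation: `TOY_LEXICON`, `tokenize` and `count_emotions` are hypothetical names, and the word-to-emotion entries are made up rather than taken from the NRC lexicon.

```python
import re

# Hypothetical toy lexicon: word -> emotion tags (NOT real NRC entries).
TOY_LEXICON = {
    "dangerous": {"fear"},
    "military": {"fear"},
    "support": {"trust"},
    "agreement": {"trust"},
}
EMOTIONS = ("fear", "trust")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_emotions(text, lexicon=TOY_LEXICON):
    """Count, per emotion, how many tokens carry that emotion tag."""
    counts = {emotion: 0 for emotion in EMOTIONS}
    for token in tokenize(text):
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts


print(count_emotions("Support the agreement, it ends the dangerous sequester."))
# {'fear': 1, 'trust': 2}
```

Words missing from the lexicon contribute nothing, which is why most emotions stay at 0 for a short text.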

Opinion mining

>>> op_lexicon = lbsa.get_lexicon('opinion', language='english', source='nrc')
>>> op_lexicon.process(tweet)
{'positive': 2, 'negative': 1}

Language detection

Language detection requires the NRC lexicon. Pass language='auto' to have the language detected; the result then includes a 'lang' key:

>>> import lbsa
>>> tweet = """
... A la suite de la tempête #Eunice et à la demande du Président de la République,
... l'Etat décrétera dans les meilleurs délais l'état de catastrophe naturelle partout
... où cela s'avérera nécessaire.
... """
>>> lexicon = lbsa.get_lexicon('sa', language='auto', source='nrc')
>>> print(lexicon.process(tweet))
{'anger': 2, 'anticipation': 1, 'disgust': 1, 'fear': 2, 'joy': 0, 'sadness': 2, 'surprise': 2,
'trust': 0, 'lang': 'french'}
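
One plausible way to detect a language from a multilingual lexicon is to pick the language whose vocabulary matches the most tokens. The sketch below illustrates that idea only. Whether LBSA works this way is an assumption, and `TOY_VOCABULARIES` and `detect_language` are hypothetical names with made-up word lists.

```python
import re

# Hypothetical per-language vocabularies (tiny and made up for illustration).
TOY_VOCABULARIES = {
    "english": {"the", "and", "storm", "state", "of"},
    "french": {"la", "le", "et", "de", "tempête", "état"},
}


def detect_language(text, vocabularies=TOY_VOCABULARIES):
    """Return the language whose vocabulary matches the most tokens."""
    tokens = re.findall(r"\w+", text.lower())
    hits = {
        lang: sum(token in vocab for token in tokens)
        for lang, vocab in vocabularies.items()
    }
    return max(hits, key=hits.get)


print(detect_language("A la suite de la tempête et à la demande du Président"))
# french
```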

Feature extractor

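Here `tweet` is the English tweet from the Sentiment analysis section, and `sa_lexicon` and `op_lexicon` are the lexicons created above:

>>> extractor = lbsa.FeatureExtractor(sa_lexicon, op_lexicon)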
>>> extractor.process(tweet)
array([0., 0., 0., 2., 0., 0., 0., 3., 2., 1.])
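
The vector above lines up with the two dictionaries shown earlier: the eight emotion counts, followed by the positive and negative counts. A minimal sketch of that layout in plain Python follows. The helper `to_feature_vector` is hypothetical and not part of LBSA, and it returns a list where the real extractor appears to return a NumPy array.

```python
def to_feature_vector(*count_dicts):
    """Concatenate the values of several count dicts into one flat list of floats."""
    return [float(value) for counts in count_dicts for value in counts.values()]


# Outputs copied from the sentiment analysis and opinion mining examples.
sa_counts = {'anger': 0, 'anticipation': 0, 'disgust': 0, 'fear': 2,
             'joy': 0, 'sadness': 0, 'surprise': 0, 'trust': 3}
op_counts = {'positive': 2, 'negative': 1}

print(to_feature_vector(sa_counts, op_counts))
# [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 3.0, 2.0, 1.0]
```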

Example

Feature extractor:

feature_extraction.py

Perform sentiment analysis over time on "Thus spoke Zarathustra":

book.py
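
Sentiment analysis over time generally means splitting a long text into consecutive chunks and scoring each one. The sketch below shows that idea under stated assumptions. It is not the content of book.py, and `sentiment_over_time`, `toy_process` and the `window` parameter are hypothetical names introduced for illustration.

```python
def sentiment_over_time(text, process, window=50):
    """Split text into consecutive windows of `window` words and score each one."""
    words = text.split()
    timeline = []
    for start in range(0, len(words), window):
        chunk = " ".join(words[start:start + window])
        timeline.append(process(chunk))
    return timeline


# Stand-in scorer so the sketch runs without the NRC lexicon installed.
# With LBSA, one would presumably pass sa_lexicon.process instead.
def toy_process(chunk):
    return {"joy": chunk.lower().split().count("joy")}


text = "joy joy sorrow sorrow joy sorrow"
print(sentiment_over_time(text, toy_process, window=2))
# [{'joy': 2}, {'joy': 0}, {'joy': 1}]
```

Each entry in the returned list corresponds to one window, so plotting one emotion across the list gives its trajectory through the text.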

About

Lexicon-based sentiment analysis inspired by Syuzhet R package
