NorSentLex

This repository is an R-format version of the Norwegian sentiment lexicons described in Barnes et al. (2019).

Installation

The package can be installed by using the install_github() function from the devtools package in R:

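# Install from GitHub (requires the devtools package)
devtools::install_github("martigso/NorSentLex")
library(NorSentLex)

# Open the help pages for the two bundled datasets
?nor_fullform_sent
?nor_lemma_sent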

Structure and usage

The package mirrors the structure of the original NorSentLex repository, but stores the lexicons as R lists. Two datasets are available: nor_fullform_sent and nor_lemma_sent. Both can be loaded in R:

data("nor_fullform_sent", package = "NorSentLex")
data("nor_lemma_sent", package = "NorSentLex")

The data are structured as follows:

Token form   Sentiment   POS
Fullform     Positive    TBD
Fullform     Negative    TBD
Lemma        Positive    adjective, noun, participle adjective, verb
Lemma        Negative    adjective, noun, participle adjective, verb

Fullform

The fullform data is a list with two elements: “positive”, holding 6103 positive fullform tokens, and “negative”, holding 14839 negative fullform tokens. Each can be extracted by name after loading the data into R (see above):

nor_fullform_sent$positive |> 
  head()
## [1] "absolutt"    "absolutta"   "absolutte"   "absoluttene" "absolutter" 
## [6] "absoluttet"
nor_fullform_sent$negative |> 
  head()
## [1] "abnorm"   "abnorme"  "abnormt"  "abort"    "aborten"  "abortene"
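
As a rough illustration of how these two vectors might be used, the sketch below counts positive and negative hits in a tokenized sentence. The helper score_tokens and the toy_lexicon list are hypothetical and not part of NorSentLex; the toy lexicon only reuses a few entries from the output printed above so that the example runs without the package installed.

```r
# Count positive and negative lexicon hits in a character vector of tokens.
# `score_tokens` is a hypothetical helper, not a NorSentLex function.
score_tokens <- function(tokens, positive, negative) {
  tokens <- tolower(tokens)            # lexicon entries are lower-case
  n_pos <- sum(tokens %in% positive)   # tokens found in the positive list
  n_neg <- sum(tokens %in% negative)   # tokens found in the negative list
  c(positive = n_pos, negative = n_neg, net = n_pos - n_neg)
}

# Stand-in lexicon (assumption): a handful of entries copied from the
# head() output above, shaped like nor_fullform_sent.
toy_lexicon <- list(
  positive = c("absolutt", "absolutte"),
  negative = c("abnorm", "abort", "aborten")
)

tokens <- c("Aborten", "var", "absolutt", "abnorm")
scores <- score_tokens(tokens, toy_lexicon$positive, toy_lexicon$negative)
print(scores)
```

With the package loaded, the same helper could be called as score_tokens(tokens, nor_fullform_sent$positive, nor_fullform_sent$negative). Plain hit counting ignores negation and word order, so it is only a starting point.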

Lemma

The lemmatized data is a list with one positive and one negative lexicon for each of the following parts of speech: adjective, noun, participle adjective, and verb:

names(nor_lemma_sent)
## [1] "lemma_adj_negative"  "lemma_adj_positive"  "lemma_noun_negative"
## [4] "lemma_noun_positive" "lemma_padj_negative" "lemma_padj_positive"
## [7] "lemma_verb_negative" "lemma_verb_positive"

These lexicons can also be extracted by calling the names within the list:

nor_lemma_sent$lemma_noun_positive |> 
  tail()
## [1] "åpenbaring" "ærbødighet" "ære"        "ærlighet"   "økning"    
## [6] "ønske"
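
Because the element names encode both part of speech and sentiment, the list can be flattened into one long table if that is more convenient. The following is a minimal sketch under the assumption that every name has the form lemma_<pos>_<sentiment>, as in the names() output above. The helper lemma_to_df and the toy_lemma list are hypothetical; the positive nouns are taken from the tail() output above, while the negative entry is a placeholder for illustration only.

```r
# Flatten a named list of lemma vectors into a data frame.
# `lemma_to_df` is a hypothetical helper, not a NorSentLex function.
# Assumes names of the form "lemma_<pos>_<sentiment>".
lemma_to_df <- function(lemma_list) {
  rows <- lapply(names(lemma_list), function(nm) {
    parts <- strsplit(nm, "_", fixed = TRUE)[[1]]
    words <- lemma_list[[nm]]
    data.frame(
      lemma = words,
      pos = rep(parts[2], length(words)),        # e.g. "noun", "adj"
      sentiment = rep(parts[3], length(words)),  # "positive" or "negative"
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Stand-in data (assumption) shaped like nor_lemma_sent.
toy_lemma <- list(
  lemma_noun_positive = c("ære", "ærlighet", "ønske"),
  lemma_adj_negative = c("abnorm")  # placeholder entry, not verified
)

lemma_df <- lemma_to_df(toy_lemma)
print(lemma_df)
```

With the package loaded, lemma_to_df(nor_lemma_sent) should give one row per lemma, provided the naming assumption holds for all eight elements.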

References

Barnes et al. (2019). Lexicon information in neural sentiment analysis: a multi-task learning approach. Proceedings of the 22nd Nordic Conference on Computational Linguistics, Turku, Finland. Available in the ACL Anthology.