
ianjeffries/hotel-review-text-mining


hotel-review-text-mining


Index

  1. Summary
  2. File Directory
  3. Language and Packages Used
  4. Credits
  5. License

Summary

This project uses R to mine sentiment from more than 21,000 reviews of resorts in the Republic of Maldives, a South Asian country in the Indian Ocean.

File Directory

  1. data - contains the three files used in analysis:
             a. maldives_hotel_reviews.csv - Hotel reviews of resorts in the Republic of Maldives.
             b. negative-lexicon.txt - Negative lexicon used to locate "negative" words.
             c. positive-lexicon.txt - Positive lexicon used to locate "positive" words.

  2. images - contains visualizations:
             a. body_wordcloud.png - Wordcloud showing commonly occurring words in the review body.
             b. header_wordcloud.PNG - Wordcloud showing commonly occurring words in the review header.
             c. monthly_sentiment.png - Overall sentiment by month for all hotels in the Republic of Maldives.
             d. reviews_by_year.png - Count of reviews by year.
             e. sentiment_comparison.png - Comparison of negative and positive word counts.
             f. top_12_hotels.png - Top 12 resorts sentiment comparison.

  3. text_mining.Rmd - R Markdown detailing the text mining process.

  4. text_mining.pdf - PDF showing the R code and its output, for easy viewing.

  5. results.pdf - A full write-up comparing text mining in R vs SAS.
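
As a rough illustration of how the two lexicon files listed under data could be used, the sketch below counts positive and negative words in a review using base R only. It is not taken from text_mining.Rmd: the word lists are tiny hypothetical stand-ins for positive-lexicon.txt and negative-lexicon.txt, the example reviews are made up, and score_review is an illustrative helper name, so treat it as one possible approach rather than the project's actual method.

```r
# Hypothetical stand-ins for positive-lexicon.txt and negative-lexicon.txt.
# The real files could be loaded with readLines() instead (an assumption).
positive_words <- c("great", "clean", "friendly", "beautiful")
negative_words <- c("dirty", "rude", "noisy", "expensive")

# Count lexicon hits in one review (illustrative helper, not from the repo).
score_review <- function(text, positive, negative) {
  # Lowercase, turn every run of non-letters into a single space, then split
  cleaned <- gsub("[^a-z]+", " ", tolower(text))
  tokens <- strsplit(cleaned, " ", fixed = TRUE)[[1]]
  tokens <- tokens[nzchar(tokens)]  # drop empty strings left by the split
  pos <- sum(tokens %in% positive)
  neg <- sum(tokens %in% negative)
  c(positive = pos, negative = neg, net = pos - neg)
}

# Made-up reviews, for demonstration only
reviews <- c(
  "Great resort, friendly staff and a beautiful beach.",
  "The room was dirty and the bar was noisy and expensive."
)

# One row per review, with columns positive, negative and net
scores <- do.call(
  rbind,
  lapply(reviews, score_review,
         positive = positive_words, negative = negative_words)
)
print(scores)
```

A positive net value suggests a mostly favourable review and a negative one a mostly unfavourable review. Simple word matching like this ignores negation ("not clean"), which is a known limitation of lexicon counts.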

Language and Packages Used

All model building is done in R; the results are then compared with those from SAS.

The following packages are used:

# list of packages used
packages <- c("tm", "wordcloud", "lubridate", "SnowballC", "ggplot2", "dplyr", "tidyr")

# load each package, installing it first if it is not already installed
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
    library(p, character.only = TRUE)
  }
}

Credits

  1. Thanks to Dr. Mo Saraee of the University of Salford for the maldives_hotel_reviews.csv dataset.
  2. Thanks to Bing Liu and Minqing Hu for the negative-lexicon.txt and positive-lexicon.txt files, which were taken from their website.

License

MIT License Copyright (c) 2019 Ian Jeffries
