Text-Classification---R

Classifying documents into categories

The file "Topic Modelling.R" is based on the blog post Topic Modeling in R.

Before running on Windows, I had some setup to do:

Install R from https://cran.r-project.org/bin/windows/base/
Install RStudio from https://www.rstudio.com/products/rstudio/download/
Create a Twitter App to collect the data at https://apps.twitter.com/app/new (more details in Twitter Analytics Using R Part 1: Extract Tweets)

The values from the Twitter app need to go into these lines in "Topic Modelling.R":

Consumer_key <- "YOUR_CONSUMER_KEY"
Consumer_secret <- "YOUR_CONSUMER_SECRET"
access_token <- "YOUR_ACCESS_TOKEN"
access_token_secret <- "YOUR_TOKEN_SECRET"

Then, in the console, make sure all the required packages are installed (these commands are also in the script "prereqs.R"):

install.packages('twitteR')
install.packages("tm")
install.packages("wordcloud")
install.packages("slam")
install.packages("topicmodels")
install.packages('base64enc')
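
Running those lines again reinstalls every package each time. As a minimal sketch (not taken from "prereqs.R"; the helper name `missing_packages` is made up for illustration), base R can report which of the packages above are still missing, so that only those need installing:

```r
# Packages named in the list above
required <- c("twitteR", "tm", "wordcloud", "slam", "topicmodels", "base64enc")

# Hypothetical helper: returns the names in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Character vector of packages that still need installing (may be empty)
to_install <- missing_packages(required)
print(to_install)
```

When the result is non-empty, passing it to `install.packages(to_install)` installs only what is missing.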

Now you are ready to run "Topic Modelling.R" in RStudio! There are two lines near the head of the file that can be used to tweak the number and type of results:

numTweets <- 900

tweetData <- searchTwitter("flight", n=numTweets, lang="en")

numTweets is used in the Twitter search and later to set the SEED for the analysis algorithm.
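
The point of a seed is repeatability: with the same seed, R's random number generator produces the same draws, so a randomised analysis gives the same result on every run. A minimal base R sketch of that idea follows; it uses `sample()` as a stand-in and is not the call made in "Topic Modelling.R":

```r
numTweets <- 900

# Seed the random number generator, then draw five numbers
set.seed(numTweets)
first_draw <- sample(1:100, 5)

# Re-seeding with the same value repeats exactly the same draws
set.seed(numTweets)
second_draw <- sample(1:100, 5)

# Compare the two sets of draws
print(identical(first_draw, second_draw))
```

Because numTweets doubles as the seed, changing it alters both the number of tweets requested and the random starting point of the analysis.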

tweetData holds the data that we want to analyse. Here I have done a simple search for English-language tweets containing the text "flight". See Getting Data via twitteR for more ideas. Look in "topic-exploration.R" and "Sentiment.R" for more ways to look at the data downloaded in "Topic Modelling.R".

