sridharvaranasi/star-wars-text-analysis

star-wars-text-analysis

A text analysis project on a collection of script dialogue between characters from Episodes IV, V, and VI of Star Wars.

Getting started

Star Wars is a popular film franchise set in a galaxy far, far away. This is a collection of script dialogue between characters from the first three films released (Episodes IV-VI). Since it's a holiday (and just because Star Wars is awesome), this data should serve as a fun way to practice text mining and linguistic analysis.

The source files are listed below:

SW_EpisodeIV.txt - Script from Episode IV: A New Hope, with columns character and dialogue.

SW_EpisodeV.txt - Script from Episode V: The Empire Strikes Back, with columns character and dialogue.

SW_EpisodeVI.txt - Script from Episode VI: Return of the Jedi, with columns character and dialogue.
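
As a rough sketch of how the three files could be loaded into one data frame, the snippet below reads each file and tags its rows with the episode. The helper name read_script and the episode column are hypothetical, and the file layout (whitespace-delimited, quoted fields, a header row naming character and dialogue) is an assumption; adjust the read.table() arguments if the actual files differ.

```r
library(dplyr)

# Hypothetical helper: read one script file and tag it with its episode.
# Assumes a whitespace-delimited file with quoted fields and a header row
# naming the columns character and dialogue.
read_script <- function(path, episode) {
  read.table(path, header = TRUE, stringsAsFactors = FALSE) %>%
    mutate(episode = episode)
}

files <- c(
  "IV" = "SW_EpisodeIV.txt",
  "V"  = "SW_EpisodeV.txt",
  "VI" = "SW_EpisodeVI.txt"
)

# Only read the files that are actually present in the working directory
present <- files[file.exists(files)]

# One data frame with columns character, dialogue, and episode
scripts <- bind_rows(Map(read_script, present, names(present)))
```

Keeping the episode as a column makes it easy to compare characters or word usage across the three films later on.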

Software used

I used R's tidytext, tm, and wordcloud packages for the text analysis. For cleaning the data and plotting the results, I used R's dplyr and ggplot2 packages.
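
To illustrate the kind of workflow these packages support, here is a minimal sketch that splits dialogue into words with tidytext and counts them per character with dplyr. It is not taken from the notebook: the small data frame is a toy stand-in with the same two columns as the source files, and the variable names are hypothetical.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for the real scripts (not read from the source files)
dialogue_lines <- tibble(
  character = c("LEIA", "VADER", "YODA"),
  dialogue  = c(
    "Help me, Obi-Wan Kenobi. You're my only hope.",
    "I find your lack of faith disturbing.",
    "Do. Or do not. There is no try."
  )
)

word_counts <- dialogue_lines %>%
  unnest_tokens(word, dialogue) %>%       # one lowercase word per row
  anti_join(stop_words, by = "word") %>%  # drop common English stop words
  count(character, word, sort = TRUE)     # word frequency per character

print(word_counts)
```

The resulting table has one row per character and word, which is a convenient shape for feeding into a ggplot2 bar chart or a word cloud.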

Finally, I have created an IPython notebook containing the text analysis and have also checked in an R Markdown file.

Author

Sridhar Varanasi
