OleksandraLaurent/Investigating_TMDb_Movie_Dataset

Investigating_TMDb_Movie_Dataset

An investigation of data on a collection of movies from TMDb. This project was completed as part of Udacity's Data Analyst Nanodegree.

The TMDb Movie dataset was selected for investigation using NumPy and Pandas. It contains information on around 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue.

  • Certain columns, like ‘cast’ and ‘genres’, contain multiple values separated by pipe (|) characters.
  • The final two columns, whose names end in “_adj”, give each movie's budget and revenue in 2010 dollars, adjusted for inflation over time.
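
A minimal sketch of how a pipe-separated column such as ‘genres’ could be unpacked with Pandas so that each value can be counted on its own. The rows below are invented for illustration, and the ‘original_title’ column name is an assumption; this is not necessarily how the project's notebook handles these columns.

```python
import pandas as pd

# Toy rows invented for illustration; they are not taken from the real dataset.
df = pd.DataFrame({
    "original_title": ["Movie A", "Movie B"],  # assumed column name
    "genres": ["Action|Comedy", "Drama"],
})

# Split each pipe-separated string into a list, then give every genre its own row.
genres_long = (
    df.assign(genre=df["genres"].str.split("|"))
      .explode("genre")
      .drop(columns="genres")
      .reset_index(drop=True)
)

# Count how often each individual genre appears.
genre_counts = genres_long["genre"].value_counts()
```

The same pattern would apply to ‘cast’ or any other multi-valued column.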

Outline of analysis

  • Introduction
  • Data Wrangling
  • Exploratory Data Analysis
  • Conclusions

Research Questions

  • How has the number of releases changed over the years?
  • What are the top 10 highest-grossing movies? Does the list differ from the movies with the biggest profit?
  • Which directors, production companies, and actors are associated with the most profitable movies?
  • Is a movie's popularity associated with its month of release?
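
The questions above could be approached with a few Pandas aggregations. The sketch below is illustrative only: the rows are invented, every column name other than the “_adj” suffix is an assumption, dates are assumed to be in ISO format, and profit is assumed to mean adjusted revenue minus adjusted budget.

```python
import pandas as pd

# Toy rows invented for illustration; column names are assumptions.
df = pd.DataFrame({
    "original_title": ["Movie A", "Movie B", "Movie C"],
    "release_date": ["2010-06-15", "2010-12-01", "2011-06-20"],
    "popularity": [1.0, 3.0, 2.0],
    "budget_adj": [140.0, 50.0, 10.0],
    "revenue_adj": [150.0, 300.0, 60.0],
})

# Derive year, month, and profit (assumed here to be revenue minus budget).
dates = pd.to_datetime(df["release_date"])
df["release_year"] = dates.dt.year
df["release_month"] = dates.dt.month
df["profit_adj"] = df["revenue_adj"] - df["budget_adj"]

# Number of releases per year.
releases_per_year = df.groupby("release_year").size()

# Top-N lists by revenue and by profit (N = 2 for the toy data, 10 in the question).
top_revenue = df.nlargest(2, "revenue_adj")["original_title"].tolist()
top_profit = df.nlargest(2, "profit_adj")["original_title"].tolist()

# Mean popularity for each release month.
popularity_by_month = df.groupby("release_month")["popularity"].mean()
```

In the toy data the two top lists differ, because a movie with high revenue and a large budget can rank below a cheaper movie on profit.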
