- Here, I have done a full data analysis and visualization on the top movies database using python libraries like: pandas, NumPy and Matplotlib.
The goal of the analysis is to answer 3 questions and draw conclusions through the analysis process using visualization tools in python.
- What are the most popular genres from year to year?
- What are the properties of the highest revenue films?
- What is the inflation rate throughout the time?
- Data Wrangling phase:
- throughout this phase, I investigated the dataset looking for duplicated rows and null values to handle them in a nice way where I don't spoil the data.
- fixing columns data type to get accurate results.
- removing unuseful columns from the dataset to keep the process nice and clean.
- Exploratory Data Analysis phase:
- throughout this phase, I answered the 3 questions I stated above, by doing some statistic calculations.
- supporting those calculations with visuals to make it easy to understand.
- drawing conclusions about each result.
- Conclusions phase:
- throughout this phase, I explained my findings and insights I got from data to non-technical clients.
- the limitation I faced throughout the process.