
VinitaSilaparasetty/EDA-with-Python-for-ABCDE-Conference


Level: Intermediate | Python 3.8.2 | Numpy 1.18.5 | Scipy 1.4.1 | Matplotlib 3.2.1 | Pandas 1.0.4 | Seaborn 0.10.1 | Jupyter Notebook 6.0.3 | Pandas-Profiling 2.8.0

EDA with Python for ABCDE Conference

https://youtu.be/fnoVTxLz4U8

Workshop Kit

Learn to conduct Exploratory Data Analysis like an expert to get the most value out of your data using Python. In this workshop you will do a guided EDA with lots of helpful tips and best practices to improve your skills.

Update: The conference has now ended, but you can view the recording of my session here: https://youtu.be/fnoVTxLz4U8

The conclusions of the conference are available here: https://youtu.be/wf9ahUAbqa0

About the Instructor

Connect with me on Social Media: http://www.officialvinita.wordpress.com

Vinita Silaparasetty is a data science instructor/trainer, author and speaker. She is passionate about the field of Artificial Intelligence, particularly Machine Learning, Deep Learning and Neural Networks, and aims to share this passion with others.

Preorder her latest book, "Deep Learning Projects Using TensorFlow 2: Neural Network Development with Python and Keras", published by Apress, New York: https://www.apress.com/gp/book/9781484258019

Open In Colab

Instructions:

Method 1:

  • Download the Workbook from this repo.
  • Fill in the code as I demonstrate during the conference.

Method 2:

  • Click the 'Open in Colab' button.
  • Once again choose the option, 'Open in Colab'.
  • Make a copy of the notebook.
  • Fill in the code as I demonstrate during the conference.

About the Dataset

Name: Bulgarian PopFolk Music

Description:

Bulgarian pop-folk (hereinafter referred to as chalga) is a dance genre stemming from ethno-pop, with strong hints of Oriental rhythms and instrumentals. Chalga is one of many branches of Balkan folk found throughout the peninsula (turbofolk in Serbia, manele in Romania, etc.). Payner LTD is a Bulgarian record label and production studio, founded in 1990. It is currently considered the largest record label in the country, producing mainly Bulgarian folk and chalga music.

Contents:

The dataset contains Spotify information for 610 resolved tracks, out of 638 detected in PlanetaOfficial, for the period 2014-2019. Each row is a track and contains:

  • the unique Spotify ID of the song;
  • pre-processed names of the first three artists in a song (if such are present), according to their order of mention;
  • name of the track;
  • datetime of the video upload in PlanetaOfficial;
  • various Spotify audio features.

Modifications:

For the purpose of this workshop, the following changes have been made:

  • First 5 entries removed from the 'time_signature' column.
  • Some rows have been duplicated.

The dataset with the above changes has been uploaded in this repository.
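The two modifications above (missing 'time_signature' entries and duplicated rows) are the kind of issue a first EDA pass usually surfaces. The following is a minimal sketch of how they could be detected with Pandas, not the workshop's own solution. It runs on a tiny made-up DataFrame: only the 'time_signature' column name comes from the dataset description, while `track_name`, the values, and the helper `summarize_quality` are hypothetical.

```python
import numpy as np
import pandas as pd


def summarize_quality(df):
    """Return missing-value counts per column and the number of duplicated rows."""
    return {
        "missing_per_column": df.isna().sum(),
        # duplicated() flags every repeat of an earlier, identical row
        "n_duplicates": int(df.duplicated().sum()),
    }


# Toy stand-in for the workshop data (hypothetical values, not the real tracks).
toy = pd.DataFrame({
    "track_name": ["a", "b", "c", "d", "d"],
    "time_signature": [np.nan, 4.0, 4.0, 3.0, 3.0],
})

report = summarize_quality(toy)
print(report["missing_per_column"])
print("duplicated rows:", report["n_duplicates"])

# Dropping exact duplicates keeps the first occurrence of each row.
cleaned = toy.drop_duplicates().reset_index(drop=True)
```

How the missing values should be handled (dropping rows, imputing, or leaving them as-is) depends on the analysis; the sketch only reports them.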

Prerequisites:

Ensure that you have the following programmes and packages installed on your system. Installation instructions can be found here: https://independent.academia.edu/VinitaSilaparasetty

  • Python 3
  • Numpy
  • Scipy
  • Matplotlib
  • Pandas
  • Pandas-Profiling
  • Seaborn
  • Jupyter Notebook
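One way to confirm that the packages listed above are installed is to query their versions from Python. This is a rough sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The distribution names in `PACKAGES` are assumptions based on the usual PyPI names (for example `notebook` for Jupyter Notebook), and `installed_versions` is a hypothetical helper, not part of the workshop material.

```python
from importlib import metadata

# Assumed PyPI distribution names for the prerequisites listed above.
PACKAGES = [
    "numpy",
    "scipy",
    "matplotlib",
    "pandas",
    "pandas-profiling",
    "seaborn",
    "notebook",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can then be set up by following the installation instructions linked above.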

Pre-Workshop Preparation: (Highly Recommended)

  1. Go through 'Data dictionary.pdf' and 'Key definitions.pdf'.

  2. Guide to using a Data Dictionary: https://medium.com/@vinitasilaparasetty/guide-to-using-a-data-dictionary-1e4c683a2100

  3. Guide to Handling Missing Values: https://medium.com/@vinitasilaparasetty/guide-to-handling-missing-values-in-data-science-37d62edbfdc1

  4. Guide to Data Cleaning: https://medium.com/@vinitasilaparasetty/guide-to-data-cleaning-for-data-science-53e056c8eb98

  5. Guide to Exploratory Data Analysis: https://medium.com/swlh/guide-to-exploratory-data-analysis-for-data-science-294baff8b741

Additional Resources: (Optional)

  1. Types of Statistics used in Data Science - Part 1: https://medium.com/@vinitasilaparasetty/types-of-statistics-used-in-data-science-part-1-8bf8b2500552

  2. Types of Statistics used in Data Science - Part 2: https://medium.com/@vinitasilaparasetty/types-of-statistics-used-in-data-science-part-2-5bbe7a221e95

  3. Primer on Quartiles: https://medium.com/@vinitasilaparasetty/quartiles-for-beginners-in-data-science-2ca5a640b07b

  4. Data Structures: https://medium.com/@vinitasilaparasetty/data-structures-in-data-science-4f47d9c4ab94

Permissions

You may make copies of the contents of this repo, but kindly mention me in the credits.

Kindly mention Atanas Stefanov and Vladislav Mihov, along with the original link to their dataset, if you use the dataset from this repo in your own projects.

About

My Workshop on EDA with Python for the ABCDE Conference (07-18-2020)
