Creating a Spotify API Web Application to Analyze Taste in Popular Music by Country

Table of Contents

- Goal

- Question

- Methods
             
- Hypothesis
           
- Similarity in Track Sets

- Similarity in Genres

- Similarity in Features

- Distribution of Features

- Null Hypothesis Test

- Stones Left Unturned
    
- Path Forward

Goal:

Explore Spotify's datasets to understand the features that its apps use to classify audio tracks and tailor music recommendations to users.

Question:

How similar or different is the popular music in different countries/regions?

Methods:

Analyze the current "Top 50" playlists for the United States, Canada, Mexico, the United Kingdom, and the global chart. Compare the playlists using the following metrics:

    - Similarity in Popular Tracks
    
    - Similarity in Popular Genres
    
    - Similarity in the Features of Popular Music (aka the essential musical/audio
      characteristics of Popular Tracks)

Hypothesis: The USA is the country whose "Top 50" tracks are most similar to those of the Global "Top 50".

Similarity in Track Sets
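
The source does not say how the overlap between track sets was measured. As a minimal sketch, one common choice is the Jaccard index (shared tracks divided by all distinct tracks across the two playlists). The track IDs and the `jaccard` helper below are hypothetical and only illustrate the idea.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of distinct tracks that two playlists have in common (0 to 1)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty playlists are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical track IDs; real ones would come from the Spotify API.
playlists = {
    "global": ["t1", "t2", "t3", "t4"],
    "usa": ["t2", "t3", "t4", "t5"],
    "mex": ["t6", "t7", "t1", "t8"],
}

# Score every unordered pair of playlists.
track_overlap = {
    (p, q): jaccard(playlists[p], playlists[q])
    for p, q in combinations(playlists, 2)
}

for pair, score in track_overlap.items():
    print(pair, round(score, 3))
```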

Similarity in Genres

Use scikit-learn's text vectorization tools to turn each playlist's list of genres into a vector of genre frequencies. Then calculate the cosine similarity between every pair of genre vectors to build a similarity matrix. Finally, plot the matrix as a heatmap to see which playlists are most similar in their genres.
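
The steps above can be sketched roughly as follows. This is an illustration, not the notebook's actual code: the genre lists are made up, and the choice of `CountVectorizer` with a pass-through analyzer (so that multi-word genres such as "hip hop" stay one token) is an assumption.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical genre lists, one entry per track artist genre in each playlist.
genres = {
    "global": ["pop", "latin", "pop", "hip hop"],
    "usa": ["hip hop", "rap", "pop", "hip hop"],
    "mex": ["latin", "reggaeton", "latin", "pop"],
}

# A callable analyzer receives each list as-is, so every genre is one token.
vectorizer = CountVectorizer(analyzer=lambda genre_list: genre_list)
counts = vectorizer.fit_transform(list(genres.values()))

# Pairwise cosine similarity between the genre-frequency vectors.
names = list(genres)
genre_sim = pd.DataFrame(cosine_similarity(counts), index=names, columns=names)
print(genre_sim.round(3))
```

Because genre counts are never negative, cosine similarity computed this way falls between 0 and 1. The resulting frame can be passed to a plotting function such as seaborn's `heatmap` to produce the visualization.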

Similarity Matrix:


	global	usa	uk	mex	can
global	1.000000	0.208407	-0.863271	-0.220740	0.488657
usa	0.208407	1.000000	-0.363109	-0.707902	0.220003
uk	-0.863271	-0.363109	1.000000	0.192726	-0.562035
mex	-0.220740	-0.707902	0.192726	1.000000	-0.660480
can	0.488657	0.220003	-0.562035	-0.660480	1.000000

The heatmap shows us that the two most similar playlists (whose intersection is the darkest shade of blue) are USA and Canada. However, contrary to my prediction, the playlist most similar to the global playlist is Canada's.

Similarity in Features

Description of Features & Correlation between Features

Calculate Similarity

Take the mean value of each feature across the tracks in a playlist, which gives one feature vector per playlist. Then calculate the cosine similarity between each pair of playlist vectors.
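
A minimal sketch of this calculation is shown below. The data is randomly generated, and `make_playlist` and `cosine_matrix` are hypothetical helpers. The standardization step is an assumption: the source does not say whether the mean vectors were scaled, but features on different scales (loudness in decibels versus 0 to 1 scores) would otherwise dominate the comparison.

```python
import numpy as np
import pandas as pd

FEATURES = ["acousticness", "danceability", "energy", "loudness", "speechiness"]


def make_playlist(seed, n=50):
    """Hypothetical stand-in for a playlist's audio-feature table."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "acousticness": rng.uniform(0, 1, n),
        "danceability": rng.uniform(0, 1, n),
        "energy": rng.uniform(0, 1, n),
        "loudness": rng.uniform(-15, 0, n),  # decibels, unlike the 0-1 features
        "speechiness": rng.uniform(0, 1, n),
    })


playlists = {name: make_playlist(seed) for seed, name in enumerate(["global", "usa", "uk"])}

# One row per playlist, one column per feature mean.
means = pd.DataFrame({name: df[FEATURES].mean() for name, df in playlists.items()}).T

# Assumption: z-score each feature across playlists before comparing.
standardized = (means - means.mean()) / means.std(ddof=0)


def cosine_matrix(frame):
    """Pairwise cosine similarity between the rows of a DataFrame."""
    values = frame.to_numpy(dtype=float)
    unit = values / np.linalg.norm(values, axis=1, keepdims=True)
    return pd.DataFrame(unit @ unit.T, index=frame.index, columns=frame.index)


feature_sim = cosine_matrix(standardized)
print(feature_sim.round(3))
```

Centering the vectors in this way is what allows negative similarities to appear; cosine similarity on the raw means would not be centered around zero.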

Cosine Similarity Matrix


	global	usa	uk	mex	can
global	1.000000	0.208407	-0.863271	-0.220740	0.488657
usa	0.208407	1.000000	-0.363109	-0.707902	0.220003
uk	-0.863271	-0.363109	1.000000	0.192726	-0.562035
mex	-0.220740	-0.707902	0.192726	1.000000	-0.660480
can	0.488657	0.220003	-0.562035	-0.660480	1.000000

Distributions of Track Features

Null Hypothesis: There is no difference between the feature means of the USA and Global playlists.

two_tailed_test(global_df, usa_df, label1='Global', label2='USA', feature='acousticness')

pval = 0.612864976409074
fail to reject null hypothesis

two_tailed_test(global_df, usa_df, label1='Global', label2='USA', feature='danceability')

pval = 0.9898624889912536
fail to reject null hypothesis

two_tailed_test(global_df, usa_df, label1='Global', label2='USA', feature='energy')

pval = 0.9966979993191145
fail to reject null hypothesis

two_tailed_test(global_df, usa_df, label1='Global', label2='USA', feature='loudness')

pval = 0.6585050655009175
fail to reject null hypothesis

two_tailed_test(global_df, usa_df, label1='Global', label2='USA', feature='speechiness')

pval = 0.7483005130667619
fail to reject null hypothesis
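
The body of `two_tailed_test` is not shown in this README. As a hedged sketch, a function with the same call signature could wrap SciPy's two-sample t-test as below. The use of Welch's unequal-variance test and the `alpha=0.05` threshold are assumptions, and the data here is synthetic.

```python
import numpy as np
import pandas as pd
from scipy import stats


def two_tailed_test(df1, df2, label1, label2, feature, alpha=0.05):
    """Hypothetical reimplementation: two-sided t-test on one feature's mean."""
    # equal_var=False selects Welch's t-test (an assumption, not from the source).
    result = stats.ttest_ind(df1[feature], df2[feature], equal_var=False)
    pval = result.pvalue
    print(f"{label1} vs {label2} ({feature}): pval = {pval}")
    if pval < alpha:
        print("reject null hypothesis")
    else:
        print("fail to reject null hypothesis")
    return pval


# Synthetic stand-ins for the playlist feature tables.
rng = np.random.default_rng(0)
sample_a = pd.DataFrame({"energy": rng.normal(0.6, 0.1, 50)})
sample_b = pd.DataFrame({"energy": sample_a["energy"] + 0.5})  # clearly shifted mean

p_same = two_tailed_test(sample_a, sample_a, "A", "A", "energy")
p_shifted = two_tailed_test(sample_a, sample_b, "A", "B", "energy")
```

A large p-value, as in the results above, means the data gives no evidence of a difference in means; it does not show that the two playlists are identical on that feature.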

Stones Left Unturned:

    - Which country most INFLUENCES the top 50?
    
    - Which Features most INFLUENCE the top 50?

Path Forward:

    - Expand the Datasets and use Machine Learning to Predict
      the popularity/ranking of a track.

jakemull13/spotify_musical_tastes_analysis