Scripts to create Music Information Retrieval datasets from streaming services for singer identification tasks

aitor-alvarez/MIR-song-dataset-collection

MIR-song-dataset-collection

The scripts in this repository search the iTunes API for the artists provided in a list (see step 2 under Instructions) and download a 30-second preview of each song by those artists.

The resulting dataset can be used to train deep learning models for singer identification tasks.
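As a rough illustration of the search step, the sketch below queries the public iTunes Search API for one artist and keeps the tracks that come with a preview. It is not taken from main.py, which may work differently. The function names build_search_url, extract_previews and search_artist are hypothetical, and the limit of 50 results is an assumption.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://itunes.apple.com/search"


def build_search_url(artist, limit=50):
    """Build a song search URL for one artist (limit is an assumed default)."""
    params = {"term": artist, "media": "music", "entity": "song", "limit": limit}
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def extract_previews(payload):
    """Return (artist, track, preview_url) tuples from a parsed API response."""
    rows = []
    for item in payload.get("results", []):
        url = item.get("previewUrl")
        if url:  # skip tracks that have no preview
            rows.append((item.get("artistName"), item.get("trackName"), url))
    return rows


def search_artist(artist, limit=50):
    """Query the API over the network and return the preview rows."""
    with urllib.request.urlopen(build_search_url(artist, limit)) as resp:
        return extract_previews(json.load(resp))
```

Calling search_artist("Nina Simone") would return one tuple per matching track with a preview URL. The search matches on a free-text term, so results may include tracks by other artists with similar names.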

Instructions

  1. Clone this repository: git clone https://github.com/aitor-alvarez/MIR-song-dataset-collection.git
  2. Create a file with the list of artists/performers. The file must be named artists.txt and contain a single column whose first row (the header) is artist. Place one artist name per row below this header.
  3. Run the following command to parse the list of artists and search the iTunes API for their songs: python main.py -a artists.txt. This creates a file named dataset.csv listing the songs found for those artists.
  4. Edit dataset.csv if you want to exclude any songs.
  5. To download the 30-second previews, run: python main.py -d
  6. All previews are downloaded into a subfolder of this repository named songs/
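The artists.txt file from step 2 can also be generated from a script. The sketch below writes the single-column layout described above (a header row named artist, then one name per row) and reads it back as a check. The helper names write_artists_file and read_artists_file are hypothetical and not part of this repository. It also assumes main.py reads the file as CSV, so a name containing a comma is written in quotes.

```python
import csv


def write_artists_file(path, artists):
    """Write a one-column file: header 'artist', then one name per row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["artist"])
        for name in artists:
            writer.writerow([name])


def read_artists_file(path):
    """Read the names back, skipping blank entries."""
    with open(path, newline="", encoding="utf-8") as f:
        return [
            row["artist"].strip()
            for row in csv.DictReader(f)
            if row["artist"] and row["artist"].strip()
        ]


if __name__ == "__main__":
    # Example artist names, chosen only for illustration.
    write_artists_file("artists.txt", ["Nina Simone", "Billie Holiday"])
    print(read_artists_file("artists.txt"))
```

After running it, python main.py -a artists.txt can be executed as in step 3.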
