A machine learning program that generates a new song that will match input text from the user.

notAFK/oxfordhack-2016

AutoDJ - Oxford Hack 2016


Description

Inspiration

Lots of people are creative enough to write awesome lyrics, but they lack the musical knowledge to create songs with them. What if we had a program to generate a good song depending on the text input? That's exactly what AutoDJ does.

What it does

Using Microsoft's Text Analytics API from Cognitive Services, we extract the key phrases from the text the user enters. We also determine whether the text is happy or sad, using the API's sentiment analysis feature. We compare the resulting features against a large collection of song lyrics using the cosine similarity measure. The best few matches are then fed into a Restricted Boltzmann Machine, which generates a new song from them.

How we built it

First we downloaded a large number of popular songs from YouTube (instrumentals only) and collected their lyrics separately from various websites. We parse the MIDI and lyrics files, then index them in a JSON database. A script then calls Microsoft's Text Analytics API to get the key phrases of the user input and compares them, using the cosine similarity measure, with the key phrases of every indexed lyrics file. We take the instrumentals of the best matches (encoded as .midi) and feed them into the neural network, which generates a new .midi that sounds similar.
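The matching step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes key phrases are compared as bag-of-words count vectors, and the names `phrase_vector`, `cosine_similarity`, and `best_matches` are hypothetical.

```python
import math
from collections import Counter


def phrase_vector(key_phrases):
    """Turn a list of key phrases into a bag-of-words count vector."""
    words = []
    for phrase in key_phrases:
        words.extend(phrase.lower().split())
    return Counter(words)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def best_matches(user_phrases, indexed_phrases, top_n=3):
    """Rank indexed songs by similarity to the user's key phrases."""
    user_vec = phrase_vector(user_phrases)
    scored = [
        (cosine_similarity(user_vec, phrase_vector(phrases)), name)
        for name, phrases in indexed_phrases.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_n]]
```

Here `indexed_phrases` is assumed to map each lyrics file name to the key phrases stored for it; the file names returned would then be used to look up the matching .midi files.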

Challenges we ran into

Generating new music is very hard; it is still an active research topic. Converting .mp3 into .midi is also hard and therefore unreliable. We had some trouble doing that, and even now the conversion is not very accurate. Of course we also had the usual issues with permissions, dependencies, UTF-8, etc., but who doesn't have those?


Documentation

CONFIG.json

{
  "MIDI_PATH": "PATH/TO/ALL/THE/MIDI/FILES/",
  "LYRI_PATH": "PATH/TO/ALL/THE/LYRI/FILES",
  "MS_CS_API_KEY": "<INSERT KEY HERE>",
  "INPUT_FILE": "USERINPUTFILE.txt"
}

By default the MIDI_PATH is inside data/midi/ and the LYRI_PATH is data/lyri/.

The MS_CS_API_KEY is the 32-character string of letters and numbers provided by the Microsoft API. The key can be found on the My Account page.

The default input file for the user input (transferred using PHP from the main website) is stored in input1.txt.
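A minimal sketch of how the configuration could be read, assuming the defaults and the key format described above. The function names `parse_config` and `load_config` are hypothetical and are not taken from the repository.

```python
import json

# Defaults taken from the text above; the API key has no default.
DEFAULTS = {
    "MIDI_PATH": "data/midi/",
    "LYRI_PATH": "data/lyri/",
    "INPUT_FILE": "input1.txt",
}


def parse_config(text):
    """Parse CONFIG.json content and fill in the documented defaults."""
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    key = config.get("MS_CS_API_KEY", "")
    # The key is described as 32 characters of letters and numbers.
    if len(key) != 32 or not key.isalnum():
        raise ValueError("MS_CS_API_KEY must be a 32-character alphanumeric string")
    return config


def load_config(path="CONFIG.json"):
    """Read the configuration file from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())
```

Validating the key up front means a forgotten `<INSERT KEY HERE>` placeholder fails immediately instead of at the first API call.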

youtubeScrapper

indexlink

indexer.py, in conjunction with makeuplink.py, creates INDEX.json, which stores the lyrics names, the artists and the sentiment scores returned by the Microsoft Cognitive Services Text Analytics API. The indexer script can be run with one of the following arguments:

  • python indexer.py midi goes through the specified MIDI_PATH and normalizes all file names.
  • python indexer.py lyri goes through the specified LYRI_PATH and normalizes all file names, then removes any whitespace and empty lines from inside each lyrics file.
  • python indexer.py index runs the indexer in production mode: it reads all the training data, calls the Microsoft API, and indexes each lyrics filename with its sentiment score and the corresponding midi file. All indexed data is stored in JSON format, representing Python dictionary objects.
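The two clean-up modes could look roughly like the sketch below. The exact normalization rules are not documented here, so this assumes lowercase, hyphen-separated names (matching the file name shown in the INDEX.json example) and assumes that "removing whitespace" means trimming each line. `normalize_name` and `clean_lyrics` are hypothetical helper names.

```python
import re


def normalize_name(filename):
    """Lowercase a file name and join its words with hyphens (assumed convention)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present
        stem, ext = filename, ""
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return slug + ("." + ext.lower() if ext else "")


def clean_lyrics(text):
    """Trim surrounding whitespace on each line and drop empty lines."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

For example, `normalize_name("Awesome Song by Mozzart (2016).LYRI")` yields `awesome-song-by-mozzart-2016.lyri`.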

INDEX.json

The structure of the JSON INDEX is as follows:

{
  "score": 0.4,
  "hash": "4b3510047885e8d8a5faff9ce821ee234d5cfd8680aae44ba40e5f749637f8cf",
  "filename": "awesome-song-by-mozzart-2016.lyri",
  "sentiment": "sad",
  "midi": "/SOMEPATH/oxfordhack-2016/data/midi/awesome-song-by-mozzart-2016.midi"
}

The score is the sentiment value returned by the Microsoft Text Analytics API, from which we derive the sad or hpy label. The hash is the SHA-256 hash of the contents of the .lyri file. The filename points to the source of the lyrics, while midi is the location of the MIDI instrumental.
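As an illustration, one index record could be assembled like this. The 0.5 cut-off between sad and hpy is an assumption (the example score of 0.4 labelled sad is consistent with it, but the actual threshold is not stated), and `sentiment_label` and `make_entry` are hypothetical names.

```python
import hashlib


def sentiment_label(score, threshold=0.5):
    """Map a 0..1 sentiment score to 'hpy' or 'sad' (threshold is an assumption)."""
    return "hpy" if score >= threshold else "sad"


def make_entry(lyrics_text, filename, midi_path, score):
    """Build one INDEX.json record with the fields listed above."""
    # SHA-256 over the UTF-8 bytes of the lyrics file content.
    digest = hashlib.sha256(lyrics_text.encode("utf-8")).hexdigest()
    return {
        "score": score,
        "hash": digest,
        "filename": filename,
        "sentiment": sentiment_label(score),
        "midi": midi_path,
    }
```

Because the hash depends only on the lyrics content, it can be used to detect duplicate lyrics stored under different file names.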

csim - Cosine Similarity

rbm - Restricted Boltzmann Machine

References

Microsoft API

The Microsoft Cognitive Services Text Analytics API is used by sending text data (lyrics) to the Microsoft API and receiving a response containing key phrases (obtained using Natural Language Processing), the language, and a sentiment score (a number between 0 and 1). Scores close to 1 indicate positive sentiment and scores close to 0 indicate negative sentiment. The sentiment score is generated using classification techniques.
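The exchange described above can be sketched without any network code. The request and response shapes below (a `documents` list with `id`, `language` and `text`, a per-document `score`, and the `Ocp-Apim-Subscription-Key` header) are assumptions based on the 2016-era Text Analytics API and should be checked against the current Microsoft documentation. `build_request` and `parse_sentiment` are hypothetical names.

```python
import json

# Assumed header name for Cognitive Services subscription keys.
KEY_HEADER = "Ocp-Apim-Subscription-Key"


def build_request(lyrics_by_id, api_key, language="en"):
    """Return (headers, body) for a batch of lyrics documents."""
    documents = [
        {"id": doc_id, "language": language, "text": text}
        for doc_id, text in lyrics_by_id.items()
    ]
    headers = {KEY_HEADER: api_key, "Content-Type": "application/json"}
    return headers, json.dumps({"documents": documents})


def parse_sentiment(response_text):
    """Map each document id to its sentiment score between 0 and 1."""
    payload = json.loads(response_text)
    return {doc["id"]: doc["score"] for doc in payload["documents"]}
```

The headers and body would be sent with an HTTP POST to the sentiment endpoint; the endpoint URL is deliberately left out because it depends on the account's region and API version.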

Devpost Submission - Devpost Link

The Devpost page was created for Oxford Hack 2016, a 24-hour hackathon.

License - !AFK

The !AFK License is a derivative of the Simple Machine License.

Video - Video Link

The video on YouTube contains a 5-minute demonstration and explanation of how the program works.