Last December I participated in the 2014 London Music Hack Day. These events are always a lot of fun, with lots of great people hacking on lots of great things; you can find a list of the hacks on Hacker League.
My hack for this edition was a rhyming Twitterbot (@RhymeBustingBot) made using the MusixMatch and Echonest APIs and the Natural Language Toolkit (NLTK) Python library. The bot was MusixMatch’s favourite hack and won their hack day prize (thanks guys!).
Here’s how it works:
Here are two of my favorite rhymes that the bot has unearthed:
Pharcyde vs. House of Pain
ODB takes on the crown of the Fresh Prince
The bot was inspired by the ever awesome @Pentametron, which is writing the world’s longest poems using tweets. Its rhyming skills are way better than the RhymeBustingBot’s. Combine this with its use of meter and the random nature of the tweets, and you get a pretty funny bot.
If you are technically minded, you can take a look at the code that makes up the RhymeBustingBot on Github. Here are a few words on how it works.
The NLTK includes the CMU pronouncing dictionary, and we use that to compare the pronunciations of the last word of each line. Starting from the end of each word, the bot counts back and checks how many of the vowels and consonants match up.
Let’s take a classic Biggie line to illustrate:
It was all a dream, I used to read Word Up! magazine
Salt-n-Pepa and Heavy D up in the limousine
Here are the pronunciations of the two rhyming words according to the Carnegie Mellon dictionary:
The letters outline the different sounds used and the numbers indicate where the emphasis is placed. You can see how the n, iy, z and ah sounds match up in this instance and that the emphasis is the same on the vowels too. Based on this, the rhyme finding algorithm assigns this rhyme a value of 4.
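The counting step can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the bot's actual code: the function name `rhyme_score` is made up for this example, and the two pronunciations are typed out by hand in CMU notation rather than looked up, so check them against the dictionary itself (in NLTK, `nltk.corpus.cmudict.dict()`) before relying on them.

```python
def rhyme_score(phones_a, phones_b):
    """Count how many trailing phonemes two pronunciations share.

    Each phoneme is compared as a whole symbol, so vowels only match
    when their stress digit matches too.
    """
    score = 0
    # Walk both words backwards, from the last sound to the first.
    for a, b in zip(reversed(phones_a), reversed(phones_b)):
        if a != b:
            break
        score += 1
    return score


# Hand-written pronunciations in CMU notation (assumed, not looked up here).
MAGAZINE = ["M", "AE1", "G", "AH0", "Z", "IY2", "N"]
LIMOUSINE = ["L", "IH1", "M", "AH0", "Z", "IY2", "N"]

print(rhyme_score(MAGAZINE, LIMOUSINE))  # 4: N, IY2, Z and AH0 match
```

The count stops at the first mismatch (G versus M here), which is why the shared AH0, Z, IY2, N tail gives a value of 4.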
That is all well and good, but the bot does have a problem with false positives. Like this tweet, where the bot seems to think “bacon” and “woman” make a good rhyme. Maybe a drawling southern rapper could sell that as a great rhyme, but to me it doesn’t really hold. The bot, however, sees both these words as ending with an AH-N sound and therefore as a rhyme with a rating of two, making it the best rhyme found in these two songs:
I’m not sure how to improve on this, other than to start white- or blacklisting rhymes or provide the bot with some editorial oversight (which seems to defeat the whole point).
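One idea that might be worth trying, sketched here purely as an assumption and not something the bot does, is to only accept a match when the shared ending contains a stressed vowel. In CMU notation a vowel ends in 0 (unstressed), 1 (primary stress) or 2 (secondary stress), so an ending like AH0-N would be thrown out. The function names and the hand-typed pronunciations below are hypothetical, and this rule would likely reject some rhymes that sound fine to a human ear.

```python
def shared_tail(phones_a, phones_b):
    """Return the run of phonemes both pronunciations end with."""
    tail = []
    for a, b in zip(reversed(phones_a), reversed(phones_b)):
        if a != b:
            break
        tail.append(a)
    return tail[::-1]


def has_stressed_vowel(phones):
    """True if any phoneme carries primary (1) or secondary (2) stress."""
    return any(p[-1] in "12" for p in phones)


def stress_aware_score(phones_a, phones_b):
    """Length of the shared ending, or 0 if that ending is unstressed."""
    tail = shared_tail(phones_a, phones_b)
    return len(tail) if has_stressed_vowel(tail) else 0


# Hand-written pronunciations in CMU notation (assumed, not looked up here).
BACON = ["B", "EY1", "K", "AH0", "N"]
WOMAN = ["W", "UH1", "M", "AH0", "N"]
MAGAZINE = ["M", "AE1", "G", "AH0", "Z", "IY2", "N"]
LIMOUSINE = ["L", "IH1", "M", "AH0", "Z", "IY2", "N"]

print(stress_aware_score(BACON, WOMAN))         # shared AH0-N is unstressed
print(stress_aware_score(MAGAZINE, LIMOUSINE))  # shared tail includes IY2
```

With these inputs the bacon/woman pair drops to zero while magazine/limousine keeps its score, but the rule is untested against real lyrics.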
Regardless, putting this thing together was a fun project and got me a bit more familiar with natural language processing. Also, every once in a while, the bot stumbles upon a very random but good rhyme, which always makes me chuckle.