Pages

Saturday, July 12, 2014

Build Your Bar Project:
What's In A Drink Name / Drink Name Generator

This post is a short-attention-span version of my previous post, plus a little gadget I wrote to generate drink names.

Recap


I have been looking at a data set containing names and recipes for ~17000 cocktails. The original goal of the project was to use publicly available recipe data to inform a decision of what drinks to buy for a small home bar. The project since spun out of control because I realized the wealth of interesting trivia in the data.

How Do People Name Their Cocktails


Conceptually it might be worthwhile to separate out drinks into several naming classes based on their popularity. There are popular drinks that can be found at nearly any bar worth its salt, there are specialty drinks which are often variants of a popular drink with ingredient tweaks, and there are unique drinks that are unlikely to be found at any bar and will require a recipe if you want one.

Each class is generally named differently. Popular drinks have readily recognizable names. Specialty drinks often have parts of a popular drink name with additional words indicating the variations involved. The unique drinks do not follow a pattern as closely as the other two, reflecting the heterogeneity of people creating these unique drinks.

In order to learn a bit about how words are distributed within drink names I collated all the drink names in my data set and performed some simple analyses. After clean-up, I made a word cloud containing the top 100 words used in drink names, sized according to their frequency. This analysis does not consider the popularity of the drink itself, but the data does contain some indication of popularity (e.g. more popular drinks are more likely to have multiple recipes with the same name).

Booze Cloud!
 
Click to embignify

An interesting follow up to this analysis would be to re-weight the word frequencies according to how popular the drinks they are found in are. This will make the data look more like what you might see on a cocktail menu.

Drink Name Generator


Part of the data exploration I calculated the frequencies of the top 500 words, as well as the distribution of the number of words in a cocktail name. The largest name in my data set was 8 words, with two being the most frequent.

I constrained my drink name generator to produce drink names between 2 and 5 words. The generator generates a random number between 0 and 1, it compares that number to a look-up table corresponding to the cumulative distribution function (cdf) of word number frequencies. That number of words is generated using the same method but from the cdf of word name frequencies.
The generator then returns your new drink name. The app was written using shiny, a way to write small apps using R, which I will probably talk about in a future post.

Here’s the generator!

Make sure to try a few, some are pretty funny, a few gems so far are "Monkey Nut Punch" and "Flaming Panty Sweat".
Post your best in the comments!

Click here to open the app in a new window

Chris

No comments:

Post a Comment