While between projects at work, or when a project is at a pause, I’m keeping myself busy in Tableau, trying ideas and sharpening my data blending and calculation building skills. When you’re just playing around you need a dataset, of course. I decided to switch things up and move from baseball to beer. I grabbed the top-rated beers from each state as rated by Beer Advocate’s readers (up to 100 per state are listed), and went to work. I wasn’t looking for anything in particular; I just consider this a first draft, a jumping-off point for discovery.
No real surprises to anyone familiar with the craft beer industry. We love our big beers, we love our hops. Most of the whales (hard-to-get, nearly mythical beers) are double IPAs, barrel aged stouts, etc. People are still traveling to Waterbury to get their Heady Topper, even further north to snag Hill Farmstead, waiting in line at Treehouse for Julius and Good Morning, begging friends in Iowa to beer-mail them some Toppling Goliath.
But these very simple worksheets have inspired three or four different paths of inquiry that I plan to turn into true dashboards. I want to play with ratings in and out of style specifications (see the American IPA tab), so I need to generate a table of ABV specifications so every style has a reference point. For now I just brute-forced it on the “American IPA – Ratings” sheet. A total of 11% of IPAs are out of spec. Not really surprising with IPAs, as people are doing all sorts of crazy things with the hops and malt bills, but if you wind up at 8.6%, is it still an IPA? And don’t get me started on 4% IPAs. Are other styles’ guidelines violated more or less often? We shall see.
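For what it’s worth, here is roughly how that reference table could work outside Tableau. This is only a minimal Python sketch, not what the workbook actually does: the `ABV_SPECS` ranges are illustrative placeholders I’d still need to check against a real style guide, the function names are made up, and the sample rows are invented rather than pulled from the Beer Advocate data.

```python
# Hypothetical ABV ranges per style, in percent. Placeholder values for
# illustration only; a real table would come from a published style guide.
ABV_SPECS = {
    "American IPA": (5.5, 7.5),
    "American Stout": (5.0, 7.0),
}


def in_spec(style, abv, specs=ABV_SPECS):
    """Return True/False for a known style, or None if the style has no range."""
    bounds = specs.get(style)
    if bounds is None:
        return None
    low, high = bounds
    return low <= abv <= high


def out_of_spec_share(beers, specs=ABV_SPECS):
    """Map each known style to the fraction of its beers outside the ABV range."""
    counts = {}  # style -> (total beers, beers out of spec)
    for style, abv in beers:
        verdict = in_spec(style, abv, specs)
        if verdict is None:
            continue  # no reference point for this style, so skip it
        total, out = counts.get(style, (0, 0))
        counts[style] = (total + 1, out + (0 if verdict else 1))
    return {style: out / total for style, (total, out) in counts.items()}


# Invented sample rows: (style, ABV)
sample = [
    ("American IPA", 6.8),
    ("American IPA", 8.6),
    ("American IPA", 4.0),
    ("American IPA", 7.0),
]
shares = out_of_spec_share(sample)
print(shares)
```

Once every style has a row in the table, the same function answers the “are other styles violated more or less?” question in one pass.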
I want to play more with geographical data, and now that Tableau 9.2 is out, I get to noodle around with the MapBox integration. If you look at the “American IPA by State” tab, you’ll see Georgia is second only to Washington with 23 of the top 100 rated beers (in state). That surprised me. I have a theory as to why that is, but Georgia specifically isn’t that important; I want to get further into heat maps of preferred styles by region and quality by region. Filter down to just the IPAs rated 4.25 or more, and you’re left with Massachusetts, California, and Vermont. No surprises there.
I need to run some of these numbers through R in different ways, to see what’s really behind the “Ratings by ABV” trend line. I mean, that’s a pretty tidy p-value, but just throwing more malt at a recipe (and the corresponding hops for balance) doesn’t make your beer better. What DOES correlate most with “good?” Of course, I don’t have a lot of independent variables with which to work, but it’s worth a shot. May have to find other sources of data, do some blending.
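Before I even get to R, a first pass at “what correlates most with good” could be as simple as ranking whatever columns I have by their correlation with rating. The sketch below is plain Python under stated assumptions: the column names (`abv`, `review_count`) and every number in `columns` and `ratings` are invented for illustration, and Pearson’s r only captures linear association, so it is a starting point and not an answer.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ols_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def rank_by_correlation(columns, target):
    """Sort candidate columns by absolute correlation with the target, strongest first."""
    scored = [(name, pearson_r(values, target)) for name, values in columns.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Invented numbers, purely to show the shape of the inputs.
ratings = [3.9, 4.0, 4.2, 4.3, 4.5]
columns = {
    "abv": [5.0, 6.5, 7.0, 8.5, 10.0],
    "review_count": [900, 150, 400, 300, 650],
}
for name, r in rank_by_correlation(columns, ratings):
    print(f"{name}: r = {r:.2f}")
```

A high r for ABV here would only restate the trend line; the interesting part is whether anything else ranks above it once I blend in more data.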
And as to goal one of this project, I finally have the hang of level of detail (LOD) calcs, and what a powerful tool they are. I remember back in the dark ages of… a year ago, jumping through all sorts of hoops, combining multiple calculations and formulas and groupings to try to tease out a specific ranking or average. Of course, I was working with small data in academia, so it wasn’t a daily problem, but there were worksheets I built that would have been much easier with LOD.
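If you haven’t used them, a FIXED LOD expression such as `{ FIXED [State] : AVG([Rating]) }` computes an aggregate at one level (state) and makes it available on every row, whatever else is in the view. Here is a rough Python analogue of that idea, just to show what the calc is doing; the field names and rows are invented, and this is an illustration of the concept, not how Tableau implements it.

```python
from collections import defaultdict


def fixed_avg(rows, dim, measure):
    """Attach the per-`dim` average of `measure` to every row, like a FIXED LOD calc."""
    totals = defaultdict(lambda: [0.0, 0])  # dim value -> [running sum, count]
    for row in rows:
        acc = totals[row[dim]]
        acc[0] += row[measure]
        acc[1] += 1
    averages = {key: total / count for key, (total, count) in totals.items()}
    new_field = f"avg_{measure}_by_{dim}"
    # Return new dicts so the input rows are left untouched.
    return [{**row, new_field: averages[row[dim]]} for row in rows]


# Invented rows for illustration.
beers = [
    {"state": "VT", "beer": "A", "rating": 4.5},
    {"state": "VT", "beer": "B", "rating": 4.0},
    {"state": "MA", "beer": "C", "rating": 4.0},
]
for row in fixed_avg(beers, "state", "rating"):
    print(row)
```

With the state average sitting on each row, comparing a single beer to its state’s average is one subtraction instead of the pile of helper calculations I used to build.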
Now, it’s time to dig deeper.