Angelos Tzelepis

Data. Localization. Technology. Fun.


Learning D3js visualization

December 24, 2015 Angelos Tzelepis

I’ve been threatening to figure out D3.js visualizations for a while now, and I finally got around to digging in. While there are a couple of decent step-by-step tutorials available online, and the documentation at D3 itself is good, there is no good quick-start guide, so the first couple of hours were rough.

For example, it’s not as easy as copying some viz code, digging up the accompanying data files (which are not always provided), putting them in a local folder, and opening the html file to see what happens. Nope! Try that and you get a blank screen, if not error messages of some sort. It was only after some intense googling that I discovered that browsers won’t let a page opened straight from disk load other local files, because security, and yes, the JSON or CSV you just built yourself counts. So you either start your browser with that security check turned off, or you run a local server using Python, MAMP, or some other dev tools. I happen to own a Desktop Server license, and I went for that first, but then I realized it’d be a lot easier to just run this in the terminal:

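python -m SimpleHTTPServer 8000

(SimpleHTTPServer is the Python 2 module; with Python 3 the equivalent is python3 -m http.server 8000.)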

Instead of double-clicking on my practice files, I used localhost:8000 in the browser URL bar, navigated to my D3 Projects folder, clicked on the html link there, and boom! The files I thought were broken and wasted an hour trying to fix were in fact fine. It was security theater protecting me from myself. Good times, good times.
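If Python isn't handy, the same idea can be sketched with Node.js, which is an assumption on my part since nothing above depends on it. The function names here (contentTypeFor, resolveRequestPath, createStaticServer) are made up for illustration, and this is a bare-bones sketch rather than a replacement for a real dev server:

```javascript
// Minimal static file server sketch using only the Node.js standard library.
const http = require("http");
const fs = require("fs");
const path = require("path");

// Content types for the file kinds a D3 practice folder usually holds.
const CONTENT_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".css": "text/css",
  ".json": "application/json",
  ".csv": "text/csv",
};

function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return CONTENT_TYPES[ext] || "application/octet-stream";
}

// Map a request URL to a file under rootDir.
// Returns null if the URL is malformed or would escape rootDir.
function resolveRequestPath(rootDir, requestUrl) {
  let urlPath;
  try {
    urlPath = decodeURIComponent(requestUrl.split("?")[0]);
  } catch (err) {
    return null;
  }
  const root = path.resolve(rootDir);
  const resolved = path.resolve(root, "." + urlPath);
  if (resolved !== root && !resolved.startsWith(root + path.sep)) return null;
  return resolved;
}

// Build the server without starting it; the caller decides the port.
function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveRequestPath(rootDir, req.url);
    if (filePath === null) {
      res.writeHead(403);
      res.end("Forbidden");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

Calling createStaticServer(".").listen(8000) from the project folder should serve the files at localhost:8000. Unlike the Python one-liner, this sketch has no directory listing, so the html file name has to be typed into the URL.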

Of course, I hadn’t actually done any work so far, just troubleshooting. The goal is to figure out how to get my own data into the proper formats for the vizzes. First up, the Sankey. I was able to grab the source data file and figure it out: first you declare all the node names (source and target), and then you declare all the source>target links and their respective counts.

A simple example:
{"nodes":[
{"name":"Bob"},
{"name":"Ted"},
{"name":"Carol"},
{"name":"Alice"},
{"name":"New York"},
{"name":"Massachusetts"},
{"name":"Pennsylvania"}
],
"links":[
{"source":0,"target":5,"value":6},
{"source":1,"target":4,"value":15},
{"source":1,"target":5,"value":12},
{"source":1,"target":6,"value":3},
{"source":2,"target":4,"value":21},
{"source":2,"target":6,"value":7},
{"source":3,"target":5,"value":9},
{"source":3,"target":6,"value":14}
]}
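Since the links point at nodes by their position in the nodes list, a wrong index is easy to miss. One way to sanity-check a file like this is to add up the link values going out of and into each node. A minimal sketch, with the data above inlined and a hypothetical helper name (nodeTotals):

```javascript
// The example data from above, inlined so the sketch runs on its own.
const sankeyData = {
  nodes: [
    { name: "Bob" }, { name: "Ted" }, { name: "Carol" }, { name: "Alice" },
    { name: "New York" }, { name: "Massachusetts" }, { name: "Pennsylvania" },
  ],
  links: [
    { source: 0, target: 5, value: 6 },
    { source: 1, target: 4, value: 15 },
    { source: 1, target: 5, value: 12 },
    { source: 1, target: 6, value: 3 },
    { source: 2, target: 4, value: 21 },
    { source: 2, target: 6, value: 7 },
    { source: 3, target: 5, value: 9 },
    { source: 3, target: 6, value: 14 },
  ],
};

// Sum the link values flowing out of and into each node.
// Throws if a link points at an index that has no node.
function nodeTotals(data) {
  const totals = data.nodes.map((node) => ({
    name: node.name,
    outgoing: 0,
    incoming: 0,
  }));
  for (const link of data.links) {
    if (!totals[link.source] || !totals[link.target]) {
      throw new Error("Link refers to a missing node: " + JSON.stringify(link));
    }
    totals[link.source].outgoing += link.value;
    totals[link.target].incoming += link.value;
  }
  return totals;
}

const totals = nodeTotals(sankeyData);
console.log(totals[1].name, totals[1].outgoing); // prints "Ted 30"
```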

There are seven nodes (numbered 0-6 in the order in which they’re declared): four people and three states. Node 0, Bob, is only linked to node 5, Massachusetts, with a value of 6. I have 6 Bobs from MA. Node 1, Ted, has links to all three states: 15 to New York (node 4), 12 to Massachusetts (node 5), and 3 to Pennsylvania (node 6). I have 30 total Teds. And so on. Of course you can have nodes in the middle too, if a node is both a target and a source. I’m keeping it simple though, and here’s what that looks like (be sure to play with the interactivity, like tooltips, sliding the nodes, etc.):


 

It was fun to finally figure this out. At my previous gig, there was interest in creating these types of visualizations for the college's administration and other groups, for exploring flows like Undergraduate Degree -> Employment Field, or Undergraduate Field -> Graduate Field. There were a lot of catches around roll-ups, however (collapsing nodes to reduce clutter), and the project never really got off the ground in any official capacity. As an example of too many nodes, here's some beer data I was playing with: 51 states, 96 styles, 3680 total beers. Way too many nodes on each side! But it was a valuable exercise in writing scripts in Excel and InDesign to turn spreadsheet data into JSON text, so it was worth it now that I've got those methods down.
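For comparison, here is roughly how the spreadsheet-to-JSON step and a simple roll-up could look in JavaScript rather than Excel or InDesign scripts. This is a sketch under assumptions: the row layout ([source, target, value]), the function names, the "Other" label, and the threshold are all mine, and the rows reuse the small people-and-states example rather than the beer data.

```javascript
// Build the {nodes, links} structure from flat rows, like a spreadsheet export.
// Each row is [sourceName, targetName, value]. Nodes are numbered in order of
// first appearance, so the numbering can differ from a hand-written file.
function buildSankeyData(rows) {
  const indexByName = new Map();
  const nodes = [];
  const indexOf = (name) => {
    if (!indexByName.has(name)) {
      indexByName.set(name, nodes.length);
      nodes.push({ name });
    }
    return indexByName.get(name);
  };
  const links = rows.map(([source, target, value]) => ({
    source: indexOf(source),
    target: indexOf(target),
    value,
  }));
  return { nodes, links };
}

// Roll-up: any target whose total across all rows is below minTotal is renamed
// to otherLabel, and rows that end up sharing a source and target are merged.
function rollUpSmallTargets(rows, minTotal, otherLabel = "Other") {
  const totalByTarget = new Map();
  for (const [, target, value] of rows) {
    totalByTarget.set(target, (totalByTarget.get(target) || 0) + value);
  }
  const merged = new Map();
  for (const [source, target, value] of rows) {
    const keptTarget = totalByTarget.get(target) < minTotal ? otherLabel : target;
    const key = JSON.stringify([source, keptTarget]);
    const row = merged.get(key) || [source, keptTarget, 0];
    row[2] += value;
    merged.set(key, row);
  }
  return [...merged.values()];
}

const rows = [
  ["Bob", "Massachusetts", 6],
  ["Ted", "New York", 15],
  ["Ted", "Massachusetts", 12],
  ["Ted", "Pennsylvania", 3],
  ["Carol", "New York", 21],
  ["Carol", "Pennsylvania", 7],
  ["Alice", "Massachusetts", 9],
  ["Alice", "Pennsylvania", 14],
];

const full = buildSankeyData(rows);
// Pennsylvania totals 24, so a threshold of 25 folds it into "Other".
const rolledUp = buildSankeyData(rollUpSmallTargets(rows, 25));
console.log(JSON.stringify(rolledUp, null, 2));
```

With real data the threshold would need tuning, and a roll-up on the source side would follow the same pattern.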

 

On to Bubble Charts! I don't know why I bothered, because those are dead simple to do in Tableau, but I was on a roll. The data sample at that link is a crazy, nested mess, but I was able to unwind it enough to figure out the data structure. The outer wrapping "name" is the root of the whole chart (in the sample it matches the name of the file you call from your script), and all the bubbles are in a "children" wrapper, with "name" and "size" designations.

This code:

{"name": "chartname",
"children": [
  {"name": "item1", "size": 5},
  {"name": "item2", "size": 10},
  {"name": "item3", "size": 8},
  {"name": "item4", "size": 13},
  {"name": "item5", "size": 7},
  {"name": "item6", "size": 20}
]
}

Gives you this Bubble Chart:

Bubble charts with one group of six items
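Chart scripts for this kind of layout commonly walk the hierarchy down to its leaves and tag each leaf with the name of its direct parent, which is then available for coloring. The sketch below shows that walk under the assumption that the sample script does something similar; flattenLeaves and the group field are hypothetical names, not the sample's own.

```javascript
// Walk a {name, children} hierarchy and return its leaves, each tagged with
// the name of its direct parent.
function flattenLeaves(root) {
  const leaves = [];
  function visit(parentName, node) {
    if (Array.isArray(node.children)) {
      node.children.forEach((child) => visit(node.name, child));
    } else {
      leaves.push({ group: parentName, name: node.name, value: node.size });
    }
  }
  visit(null, root);
  return leaves;
}

const chart = {
  name: "chartname",
  children: [
    { name: "item1", size: 5 },
    { name: "item2", size: 10 },
    { name: "item3", size: 8 },
    { name: "item4", size: 13 },
    { name: "item5", size: 7 },
    { name: "item6", size: 20 },
  ],
};

const leaves = flattenLeaves(chart);
// Every leaf has the same parent, so there is only one group to color by.
const groupNames = new Set(leaves.map((leaf) => leaf.group));
console.log(leaves.length, groupNames.size); // prints "6 1"
```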

All six items are children of chartname, and thus have the same color. If you want your items grouped in subclasses, you need to nest more names and more children. In the example below, the children of chartname are Groups 1-3, and each of those groups has two items as children:

This code:

{"name": "chartname", "children": [
{"name": "Group 1",
  "children": [
   {"name": "item 1", "size": 5},
   {"name": "item 2", "size": 10}]},
{"name": "Group 2",
  "children": [
   {"name": "item 3", "size": 8},
   {"name": "item 4", "size": 13}]},
{"name": "Group 3",
  "children": [
   {"name": "item 5", "size": 7},
   {"name": "item 6", "size": 20}]}
]}

Gives you this Bubble Chart:

Bubble charts with three groups of two items
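Both shapes above can be generated from flat rows instead of typed by hand. A minimal sketch, assuming each row carries a name, a size, and an optional group (that row layout and the name buildBubbleData are my own, for illustration):

```javascript
// Turn flat rows into the nested {name, children} shape.
// Each row is {name, size, group}; rows with no group become direct children
// of the chart, and groups appear in order of first appearance.
function buildBubbleData(chartName, rows) {
  const root = { name: chartName, children: [] };
  const groupsByName = new Map();
  for (const { name, size, group } of rows) {
    if (group === undefined) {
      root.children.push({ name, size });
      continue;
    }
    if (!groupsByName.has(group)) {
      const groupNode = { name: group, children: [] };
      groupsByName.set(group, groupNode);
      root.children.push(groupNode);
    }
    groupsByName.get(group).children.push({ name, size });
  }
  return root;
}

const rows = [
  { group: "Group 1", name: "item 1", size: 5 },
  { group: "Group 1", name: "item 2", size: 10 },
  { group: "Group 2", name: "item 3", size: 8 },
  { group: "Group 2", name: "item 4", size: 13 },
  { group: "Group 3", name: "item 5", size: 7 },
  { group: "Group 3", name: "item 6", size: 20 },
];

const nested = buildBubbleData("chartname", rows);
console.log(JSON.stringify(nested, null, 2));
```

Leaving the group off every row gives the single-group shape from the first example.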

Each group is automatically given its own shade. Why they're so similar, I have no idea, but I plan to dig into formatting soon. Mind your braces, brackets, and commas! Stray ones or missing ones are nothing but trouble, and are the first thing to look for if your viz page comes up blank.
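A quick way to catch a stray or missing comma before loading the page is to run the text through JSON.parse and look at the error message. A small sketch (checkJson is a hypothetical helper name):

```javascript
// Try to parse JSON text; report the parser's message instead of throwing.
function checkJson(text) {
  try {
    return { ok: true, data: JSON.parse(text) };
  } catch (err) {
    return { ok: false, message: err.message };
  }
}

const good = '{"name": "chartname", "children": [{"name": "item1", "size": 5}]}';
// Same data with a stray comma after the last child.
const strayComma = '{"name": "chartname", "children": [{"name": "item1", "size": 5},]}';

console.log(checkJson(good).ok);       // prints true
console.log(checkJson(strayComma).ok); // prints false
console.log(checkJson(strayComma).message);
```

The exact wording of the message depends on the JavaScript engine, but it usually points at the position where parsing failed.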

Here's a chart using the beer ratings data from this post. The circle sizes represent the number of beers reviewed in that style, and the colors represent the style groupings according to the 2008 BJCP guidelines, as I haven't reorganized my groupings to match the new 2015 list. In terms of data structure, the first children of the chart are the 26 style groupings, and each of them has the individual styles as children.

This post is 5 miles long already, but in quickly trying to recreate this in Tableau, I ran into the following bug:

After this video, I tried 51 marks (the states) and had the same problem. Then I started with a fresh generic Excel file with 25, then 50, then 75, then 100 items with text of varying lengths, and had no problems. I don't know where the bug is yet, but with a real working Excel file, I could not get all my bubbles to display.

Next step: figure out formatting. I'd like more control over colors, font contrast, etc. Also, figure out the data structures for more chart types, especially hierarchical edge bundles, node-link trees, Voronoi maps, and choropleth maps.


