I am a bit nervous about doing this, but since several people asked, here goes: You can now download the travel-time map of the Netherlands I made in Processing. I have exported applications for Linux, Mac OS X and Windows. Each download includes the source files, but not the data file. For that, you will need to head to Alper’s site (he’s the guy who pulled the data from 9292 and ANWB). I hope you’ll enjoy playing around with this, or learn something from the way it was put together.
Some notes, in no particular order:
Please remember I am not a programmer. The vast majority of this sketch was put together from bits and pieces of code I found in books and online. I have tried to credit all the sources in the code. The full write-up I posted earlier should point you to all the sources too. In short: all the good bits are by other people, the bad code is mine. But who cares, it’s the end result that counts (at least for me).
Related to the previous point is the fact that I cannot figure out under which license (if any) to release this. So the usual CC by-nc-sa license applies, as far as I’m concerned.
If this breaks your computer, offends you, makes you cry, or eats your kittens, do not come knocking. This is provided as is, no warranties whatsoever, etc.
Why am I nervous? Probably because for me the point of the whole exercise was the process, not the outcome.
Subscribers to my Flickr stream have probably noticed a number of images of some kind of map flowing past lately. They were the result of me tracking my progress on a pet project. I have more or less finished work on it this week, so I thought I’d detail what I did over here.
Following my Twitter dataviz sketches, I thought I’d take another stab at prototyping with Processing. On the one hand I wanted to increase my familiarity with the environment. On the other, I continued to be fascinated with data-visualization, so I wanted to do another design exercise in this domain. I was particularly interested in creating displays that assist in decision making and present data in a way that allows people to ‘play’ with it — explore it and learn from it.
The seed for this thing was planted when I saw Stamen’s work on the mySociety travel-time maps. I thought the idea of visually overlaying two datasets and allowing the intersection to be manipulated by people was simple but powerful. But, at that time, I saw no way to ‘easily’ try my hand at something similar. I had no ready access to any potentially interesting data, and my scraping skills are limited at best.
Luckily, I was not the only one whose curiosity was piqued. After seeing Ben Cerveny demoing the same maps at The Web and Beyond 2008, Alper wondered how hard it would be to create something similar for the Netherlands. He presented a way to do it with freely available tools and data (to an extent) in a workshop at a local unconference.
I did not attend the event, but after seeing his blog post, I sent him an email and asked if he was willing to part with the data he had collected from the Dutch public transport travel planning site 9292. Alper being the nice guy he is, he soon emailed me a JSON file containing the data.
So that’s the background. I had an example, I had some data, and I had a little experience with making things in Processing.
The first step was to read the data in the JSON file from Processing. I followed the instructions on how to get the JSON library into Processing from Ben Fry’s book (pages 315–316). On the Processing boards, a cursory search unearthed some code examples. After a little fiddling, I got it to work and could print the data to Processing’s console.
Next up was to start visualizing it. I used the examples of scatterplot maps in Visualizing Data as a starting point, and plugged in the JSON data. Pretty soon, I had a nice plot of the postal codes that actually resembled the Netherlands.
From there, it was rather easy to show each postal code’s travel time. I simply mapped travel times to a hue in the HSB spectrum. The result nicely shows colored bands of travel-time regions and also allows you to pick out some interesting outliers (such as Groningen in the north).
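The hue mapping described above can be sketched roughly as follows. This is a minimal illustration in plain Java rather than the code from the actual sketch; the class and method names, the 0 to 200 minute range, and the decision to stop at a hue of 0.75 (so the scale does not wrap back around to red) are all assumptions made for the example.

```java
import java.awt.Color;

final class TravelTimeHue {
    // Linearly rescale a value from [inMin, inMax] to [outMin, outMax],
    // the same idea as Processing's map() function.
    static float rescale(float value, float inMin, float inMax, float outMin, float outMax) {
        return outMin + (value - inMin) / (inMax - inMin) * (outMax - outMin);
    }

    // Hue in [0, maxHue] for a travel time in minutes. Times outside the
    // range are clamped so they still get a valid color.
    static float hueFor(float minutes, float minMinutes, float maxMinutes, float maxHue) {
        float clamped = Math.max(minMinutes, Math.min(maxMinutes, minutes));
        return rescale(clamped, minMinutes, maxMinutes, 0f, maxHue);
    }

    // Fully saturated, fully bright color for a travel time.
    // The 0.75 hue ceiling is an assumption, not taken from the original sketch.
    static Color colorFor(float minutes, float minMinutes, float maxMinutes) {
        float hue = hueFor(minutes, minMinutes, maxMinutes, 0.75f);
        return Color.getHSBColor(hue, 1f, 1f);
    }

    public static void main(String[] args) {
        for (float minutes : new float[] {0f, 60f, 120f, 199f}) {
            System.out.println(minutes + " min -> " + colorFor(minutes, 0f, 200f));
        }
    }
}
```

In a Processing sketch the same thing would likely be done with colorMode(HSB) and map(), which removes the need for the helper functions.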
At this point, I wanted to be able to select travel-time ranges and hide postal codes outside of that range. Initially, I used the keyboard for input. This was OK for this stage of the project, but of course it would need to be replaced with something more intuitive later on. In any case, I could highlight selected points and dim others, which increased the display’s explorability considerably.
The HSB spectrum is a quick and easy way of getting access to a full range of colors. It served me well in my Twitter visualizations. However, in this case it left something to be desired, aesthetically speaking. Via Tom Carden I found the wonderful cpt-city, which catalogues gradients for cartography and the like. Initially I struggled with ways to get these colors into Processing, but then it turned out you could easily read out the colors of pixels from images. This allowed me to cycle through many palettes just by adding the files to my Processing sketch. I discovered that a palette with a clear division in the middle was best, because that provides you with an extra reference point besides the beginning and end.
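Reading a palette from the pixels of an image could look something like the sketch below. It is written in plain Java and is only an assumption about how such a lookup might work; the class name, the method names, and the generated two-color strip (standing in for a real gradient file) are hypothetical.

```java
import java.awt.image.BufferedImage;

final class ImagePalette {
    // Pick the color at a normalized position t in [0, 1] by reading one
    // pixel from the top row of a horizontal gradient image.
    static int colorAt(BufferedImage gradient, float t) {
        float clamped = Math.max(0f, Math.min(1f, t));
        int x = Math.round(clamped * (gradient.getWidth() - 1));
        return gradient.getRGB(x, 0); // packed ARGB value
    }

    // Stand-in for a palette file: a strip that is blue on the left half
    // and red on the right half, i.e. a palette with a clear division
    // in the middle.
    static BufferedImage demoGradient(int width) {
        BufferedImage img = new BufferedImage(width, 1, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            img.setRGB(x, 0, x < width / 2 ? 0x0000FF : 0xFF0000);
        }
        return img;
    }

    public static void main(String[] args) {
        BufferedImage gradient = demoGradient(100);
        for (float t : new float[] {0f, 0.25f, 0.75f, 1f}) {
            System.out.println(t + " -> #" + Integer.toHexString(colorAt(gradient, t)));
        }
    }
}
```

With a real palette file, the image would be loaded from disk (for example with ImageIO.read in Java, or loadImage() in Processing) instead of being generated, so swapping palettes only means swapping files.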
I next turned to the interaction bits. I knew I wanted a so-called dual slider that would allow people to select the upper and lower limit of travel time. In the Processing book, there is code for plenty of interface widgets, but sadly no dual slider. I looked around on the Processing board and could find none either, to my surprise. Even in the UI libraries (such as controlP5 and Interfascia) I could not locate one.
So I decided to lower the bar and first include two horizontal sliders, one for the upper and one for the lower limit. These I made using the code on pages 448–452 of the Processing book. Not perfect, but an improvement over the keyboard controls.
Selecting, yet again
Next, I decided I’d see if I could modify the code of the horizontal scrollbar so that I would end up with a dual slider. After some messing about (which did increase my understanding of the original code considerably) I managed to get it to work. This was an unexpected success. I now had a decent dual slider.
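The core of a dual slider is the rule that the two handles may never cross. The sketch below shows only that constraint logic in plain Java, with no drawing or mouse handling. It is not the modified scrollbar code from the Processing book; the class and method names are made up for illustration.

```java
final class DualSlider {
    private final float min;
    private final float max;
    private float low;
    private float high;

    // Start with the full range selected.
    DualSlider(float min, float max) {
        this.min = min;
        this.max = max;
        this.low = min;
        this.high = max;
    }

    // The lower handle can move between the slider minimum and the upper handle.
    void setLow(float value) {
        low = clamp(value, min, high);
    }

    // The upper handle can move between the lower handle and the slider maximum.
    void setHigh(float value) {
        high = clamp(value, low, max);
    }

    float getLow() { return low; }

    float getHigh() { return high; }

    // True when a travel time falls inside the selected range (inclusive).
    boolean contains(float value) {
        return value >= low && value <= high;
    }

    private static float clamp(float value, float lo, float hi) {
        return Math.max(lo, Math.min(hi, value));
    }

    public static void main(String[] args) {
        DualSlider slider = new DualSlider(0f, 200f);
        slider.setLow(50f);
        slider.setHigh(30f); // tries to cross the lower handle, so it stops at 50
        System.out.println(slider.getLow() + " .. " + slider.getHigh());
    }
}
```

In an interactive version, the mouse position would be converted to a value and passed to setLow or setHigh depending on which handle is being dragged.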
So far there was no way of telling which point corresponded to which postal code. So, I added a rollover that displayed the postal code’s name and travel time. At this point it became clear the data wasn’t perfect — some postal codes were erroneously geocoded by GeoNames. For instance, code 9843 (which is Grijpskerk, 199 minutes to the Dam) was placed on the map as Amsterdam Noord-Oost!
Adding more data
Around this point I visited Alper in Delft and we discussed adding a second dataset. Although housing prices à la mySociety would have been interesting, we decided to take a different route and add a second travel-time set for cars. My first step in integrating this was to simply generate a map each for the public transport and car travel data and manually juxtapose them. What I liked about this was that even though you know intuitively that traveling by car is faster, the two maps next to each other provide a dramatic visual confirmation of this piece of knowledge.
Moving ahead with the extra data, I started to struggle with how to represent both travel times. My first effort was to draw two sets of dots on top of each other (one for car travel times and one for public transport) and color each accordingly. For each set I introduced a separate slider. I wasn’t very satisfied with the result of this. It did not help in understanding what was going on that much.
After discussions with Alper and several other people, I decided it would make more sense to show the difference between travel times. So I calculated the percentage difference between public transport and car travel time for each postal code. This value I mapped to a color. Here, a simple gradient worked better than the palettes used earlier for travel times.
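One way to compute such a difference and turn it into a gradient color is sketched below in plain Java. The exact formula used in the original sketch is not stated, so the choice of car travel time as the baseline, the clamping of the result, and all names here are assumptions.

```java
final class TravelTimeDifference {
    // How much longer (in percent) public transport takes compared to the car.
    // Assumption: the car time is the baseline; negative values mean
    // public transport is faster.
    static float percentDifference(float transitMinutes, float carMinutes) {
        return (transitMinutes - carMinutes) / carMinutes * 100f;
    }

    // Blend two packed RGB colors channel by channel; t = 0 gives 'from',
    // t = 1 gives 'to'. Similar in spirit to Processing's lerpColor().
    static int lerpColor(int from, int to, float t) {
        int r = lerpChannel((from >> 16) & 0xFF, (to >> 16) & 0xFF, t);
        int g = lerpChannel((from >> 8) & 0xFF, (to >> 8) & 0xFF, t);
        int b = lerpChannel(from & 0xFF, to & 0xFF, t);
        return (r << 16) | (g << 8) | b;
    }

    private static int lerpChannel(int a, int b, float t) {
        return Math.round(a + (b - a) * t);
    }

    // Map a percentage difference onto a simple two-color gradient.
    // Differences below 0 or above maxPercent are clamped to the ends.
    static int colorFor(float percent, float maxPercent, int from, int to) {
        float t = Math.max(0f, Math.min(1f, percent / maxPercent));
        return lerpColor(from, to, t);
    }

    public static void main(String[] args) {
        float percent = percentDifference(90f, 60f);
        int color = colorFor(percent, 100f, 0xFFFFFF, 0xFF0000);
        System.out.println(percent + "% -> #" + Integer.toHexString(color));
    }
}
```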
I also discarded the idea of having two dual sliders and simply went with one travel time selector. Although more user-friendly, it created a new problem: for some points both travel times would fall within the selected range, and for others one or the other. So I needed an extra visual dimension to show this. This turned out to be the greatest challenge.
After trying many approaches, I eventually settled on using the shape of the point to show which travel times fell within the range. A small dot means that only the public transport travel time is within the range, a donut means that only the car travel time is within the range, and a big dot means that both are.
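The shape rule above boils down to two range checks per postal code. A minimal sketch in plain Java, with hypothetical names; the NONE case for points where neither time is in range is an assumption about how unselected points are handled.

```java
final class PointShape {
    enum Shape { NONE, SMALL_DOT, DONUT, BIG_DOT }

    // Decide which shape to draw for one postal code, given both travel
    // times and the currently selected range (inclusive on both ends).
    static Shape shapeFor(float transitMinutes, float carMinutes, float low, float high) {
        boolean transitInRange = transitMinutes >= low && transitMinutes <= high;
        boolean carInRange = carMinutes >= low && carMinutes <= high;

        if (transitInRange && carInRange) {
            return Shape.BIG_DOT;   // both travel times selected
        } else if (transitInRange) {
            return Shape.SMALL_DOT; // only public transport selected
        } else if (carInRange) {
            return Shape.DONUT;     // only car selected
        }
        return Shape.NONE;          // assumption: such points are dimmed or hidden
    }

    public static void main(String[] args) {
        // Selected range: 50 to 100 minutes.
        System.out.println(shapeFor(90f, 60f, 50f, 100f));
        System.out.println(shapeFor(90f, 30f, 50f, 100f));
        System.out.println(shapeFor(150f, 60f, 50f, 100f));
    }
}
```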
Around this point I felt that it was time to wrap up. I had learnt about all I could from the exercise and any extra time spent on the project would result in marginal improvements at best. I added a legend for both the shapes and color, improved the legibility of the rollover and increased the visual affordance of the slider, and that was it.
It is becoming apparent to me that the act of building displays like this is playful in its own way. Through sketching in code, you can have something like a conversation with the data and get a sense of what’s there. Perhaps the end result is merely a byproduct of this process?
I’m amazed at how far a novice programmer like myself, with a dramatic lack of affinity for anything related to mathematics or physics, can get by simply modifying, augmenting and combining code that is already out there. I have no ambition whatsoever of becoming a professional developer of production-quality code. But building a collection of bits and pieces of code that can do useful and interesting things seems like a good strategy for any designer. I am learning to trust my innate reluctance to code stuff from scratch.
Also, isn’t it cool that it is becoming increasingly feasible for regular citizens to start analyzing data that is, or at least should be, publicly available? Government still has a long way to go. Why do we need to go through the painstaking process of scraping this data from sources such as 9292, which for all intents and purposes is a public service?
I will probably make the final prototype available online at some point in the future. For now, if you have any questions or comments I would love to hear them here, or via email.
When the NLGD Foundation invited me to speak at their annual Festival of Games I asked them what they would like me to discuss. “Anything you like,” was what they said, essentially. I decided to submit an abstract dealing with data visualization. I had been paying more and more attention to this field, but was unsuccessful in relating it to the other themes running through my work, most notably play. So I thought I’d force myself to tackle this issue by promising to speak about it. Often a good strategy, I’ve found. Whether it worked out this time I leave for you to judge.
In brief, in the presentation I argue two things: one — that the more sophisticated applications of interactive data visualization resemble games and toys in many ways, and two — that game design can contribute to the solutions to several design issues I have detected in the field of data visualization.
Below are the notes for the talk, slightly edited, and with references included. The full deck of slides, which includes credits for all the images used, is up on SlideShare.
Hello everyone, my name is Kars Alfrink. I am a Dutch interaction designer and I work freelance. At the moment I work in Copenhagen, but pretty soon I will be back here in Utrecht, my lovely hometown.
In my work I focus on three areas: mobility, social interactions, and play. Here is an example of my work: These are storyboards that explore possible applications of multitouch technology in a gated community. Using these technologies I tried to compensate for the negative effects a gated community has on the build-up of social capital. I also tried to balance ‘being-in-the-screen’ with ‘being-in-the-world’: multitouch technologies tend to be very attention-absorbing, but in built environments this is often not desirable.
I am not going to talk about multitouch though. Today’s topic is data visualization and what opportunities there are for game designers in that field. My talk is roughly divided into three parts. First, I will briefly describe what I think data visualization is. Next, I will look at some applications beyond the very obvious. Third and last, I will discuss some design issues involved with data visualization. For each of these issues, I will show how game design can contribute.
Sketching is the defining activity of design, writes Buxton, and I tend to agree. The genius of his book is that he shows sketching can take on many forms. It is not limited to working with pencils and paper. You can sketch in 3D using wood or clay. You can sketch in time using video, etc. Buxton does not include many examples of sketching in code, though.1 Programming in any language tends to be a hard-earned skill, he writes, and once you have achieved sufficient mastery in it, you tend to try and solve all problems with this one tool. Good designers can draw on a broad range of sketching techniques and pick the right one for a given situation. This might include programming, but then it would need to conform to Buxton’s defining characteristics of sketching: quick, inexpensive, disposable, plentiful, offering minimal detail, and suggesting and exploring rather than confirming.
I have been spending some time broadening my sketching repertoire as a designer. Before I started interaction design I was mostly into visual arts (drawing, painting, comics) so I am quite comfortable sketching in 2D, using storyboards, etc. Sketching in code, though, has always been a weak spot. I have started to remedy this by looking into Processing.
As an exercise I took some data from Twitter (one data set was the 20 most recent tweets and the other my friends list) and decided to see how quickly I could create a few different visualizations of that data. The end results were:
one: a timeline that spatially plots the latest tweets from my friends — showing density at certain points in time; or how ‘noisy’ it is on my Twitter stream,
two: an ordering of friends based on the percentage of their tweets that take up my timeline — who’s the loudest of my friends?,
three: a graph of my friends list, with number of friends and followers on the axes and their total number of tweets mapped to the size of each point.
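As an illustration of the second visualization in the list above, the share of the timeline taken up by each friend can be computed with a simple count. This is a sketch in plain Java under stated assumptions: the input is just a list of author names, one per tweet, and the names and sample data are invented. It does not use Twitter4J or any real API call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimelineShare {
    // Percentage of the timeline taken up by each author, loudest first.
    // Ties are broken alphabetically so the order is predictable.
    static LinkedHashMap<String, Double> shares(List<String> tweetAuthors) {
        Map<String, Integer> counts = new HashMap<>();
        for (String author : tweetAuthors) {
            counts.merge(author, 1, Integer::sum);
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        LinkedHashMap<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            result.put(entry.getKey(), 100.0 * entry.getValue() / tweetAuthors.size());
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample data: one author name per tweet in the timeline.
        List<String> authors = Arrays.asList("anna", "bob", "anna", "carl", "anna", "bob");
        shares(authors).forEach((author, percent) ->
                System.out.println(author + ": " + String.format("%.1f", percent) + "%"));
    }
}
```

The sorted percentages could then be mapped to bar lengths or point sizes when drawing.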
The aim was not to come up with groundbreaking solutions, or finished applications. The goal was to exercise this idea of sketching in code and use it to get a feel for a ‘complex’ data set, iterating on many different ways to show the data before committing to one solution. In a real-world project I could see myself as a designer doing this and then collaborating with a ‘proper’ programmer to develop the final solution (which would most likely be interactive). I would choose different sketching techniques to design the interactive aspects of a data visualization. For now I am content with Processing sketches that simply output a static image.
Tools & resources used were:
Processing — a free and open source programming environment for visual folks
The Twitter4J library for working with the Twitter API
If as a designer you are confronted with a project that involves making a large amount of data understandable, sketching in code can help. You can use it to ‘talk’ to the data, and get a sense of its ‘shape’.
There is one involving Phidgets and Max/MSP, a visual programming solution for physical computing.