R’s paste0 function is one of the things we learned about in this week’s videos for the Data Analysis course. I didn’t think much of it at the time, but I was wrong! I just learned about statistical computing’s most influential contribution of the 21st century!

My first D3 graph

I’m trying to master D3, a JavaScript library for creating (interactive) web graphics. As an exercise, I redid this graph, which uses Eurostat data on the percentage of the population who have ever written a computer programme.

I can’t say it’s a very good graph: some of the most intriguing aspects of the data have to do with changes over time (decline in some countries, rather large growth in Finland, implausible fluctuations in the Netherlands), which don’t show very well in my graph. Nevertheless, it feels good to have coded my first interactive D3 graph.

P.S. The graph may not be visible in older versions of Internet Explorer.

Clint Eastwood won the LAUGHTER contest

I’m not sure what this says about the audiences at US national party conventions, but among a sample of 16 speeches, Clint Eastwood’s was the one that elicited the most laughter (Rand Paul’s got the most applause). Among the presidential candidates, Obama won the applause contest, while being about as funny as Romney.

For the second lesson of Alberto Cairo’s online data visualisation course, we were asked to comment on and perhaps redesign this convention word count tool created by the NYT. I wouldn’t be able to do such a cool interactive thing myself (I got stuck in the jQuery part of Codeyear), so I decided to focus on differences between individual speeches instead.

First I needed the transcripts – preferably from a single source, to make sure the transcription had been done in a uniform way. As far as I could find, Fox News has the largest collection of transcripts online. As a result, Republican speakers are overrepresented in my sample, but that’s OK because the key Democratic speakers are included as well.

I wrote a script to do the word count (I’m sure this could be done in a more elegant way). One problem with my script was that HTML code got included in the total word count. I thought I could correct this by subtracting 1,000 from each word count, but this didn’t work so well, so I had to make some corrections.
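For what it’s worth, here is a rough sketch of how the HTML problem could be avoided: strip the markup first and only then count. This is not the script I used. It is written in Python, it assumes the transcripts mark audience reactions as (LAUGHTER) and (APPLAUSE), and the names TextExtractor, visible_text and count_reactions are made up for the illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page and ignores the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style block.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the page text with all tags, scripts and styles removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def count_reactions(html, markers=("LAUGHTER", "APPLAUSE")):
    """Count reaction markers and spoken words in one transcript page.

    Assumption: reactions appear in the text as "(LAUGHTER)" and "(APPLAUSE)".
    """
    text = visible_text(html)
    counts = {m: len(re.findall(r"\(" + m + r"\)", text)) for m in markers}
    # Drop the markers themselves so they are not counted as spoken words.
    spoken = re.sub(r"\((?:" + "|".join(markers) + r")\)", " ", text)
    counts["words"] = len(re.findall(r"[A-Za-z0-9']+", spoken))
    return counts
```

Calling count_reactions on the HTML of a transcript page gives a dictionary with the number of laughs, the number of applause breaks and the number of spoken words, so no guesswork about subtracting 1,000 is needed. Dividing the reactions by the word count would also make long and short speeches easier to compare.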

This assignment was a bit of a rush job so I hope I didn’t make any stupid mistakes.