Clint Eastwood won the LAUGHTER contest

I’m not sure what this says about the audiences at US national party conventions, but among a sample of 16 speeches, Clint Eastwood’s was the one that elicited the most laughter (Rand Paul’s got the most applause). Among the presidential candidates, Obama won the applause contest, while being about as funny as Romney.

For the second lesson of Alberto Cairo’s online data visualisation course, we were asked to comment on and perhaps redesign this convention word count tool created by the NYT. I wouldn’t be able to do such a cool interactive thing myself (I got stuck in the jQuery part of Codeyear), so I decided to focus on differences between individual speeches instead.

First I needed the transcripts – preferably from one single source to make sure the transcription had been done in a uniform way. As far as I could find, Fox News has the largest collection of transcripts online. As a result, Republican speakers are overrepresented in my sample, but that’s ok because the key Democratic speakers are included as well.

I wrote a script to do the word count (I’m sure this could be done in a more elegant way). One problem with my script was that HTML code got included in the total word count. I thought I could correct this by subtracting 1,000 from each word count, but this didn’t work so well, so I had to make some corrections.
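For anyone who wants to try something similar, here is a minimal sketch of one way to keep the HTML out of the count: strip the markup first, then count. It is not the script I used. It assumes the transcripts mark audience reactions as `(LAUGHTER)` and `(APPLAUSE)`, and the names `TextExtractor` and `count_speech` are made up for this example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def count_speech(html):
    """Return word, laughter and applause counts for one transcript page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)

    # Assumption: reactions appear as (LAUGHTER) and (APPLAUSE) in the text.
    laughter = len(re.findall(r"\(LAUGHTER\)", text))
    applause = len(re.findall(r"\(APPLAUSE\)", text))

    # Remove the reaction markers so they are not counted as spoken words.
    spoken = re.sub(r"\((?:LAUGHTER|APPLAUSE)\)", " ", text)
    words = len(re.findall(r"[A-Za-z0-9']+", spoken))

    return {"words": words, "laughter": laughter, "applause": applause}
```

Because the markup is removed before counting, there is no need to subtract a fixed number from each total. Dividing the laughter and applause counts by the word count would also make long and short speeches easier to compare.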

This assignment was a bit of a rush job so I hope I didn’t make any stupid mistakes.

Data visualisation course assignment

As part of Alberto Cairo’s data visualisation course, we’ve been asked to take a look at this graphic of social media use in selected countries and see how it can be improved. What struck me most (although this probably would not surprise social media experts) is the high level of activity in emerging economies. Above is my reinterpretation of the data. As a general indicator of social media use, I calculated the average of the listed types of social media use (upload photos; upload videos; manage profile; blogging; microblogging). Note that the data are from 2009.
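The general indicator is nothing more than an unweighted mean of the five activities per country. A small sketch of that calculation, assuming each country’s figures are percentages stored in a dictionary (the function names are invented for this example, and no real figures are included):

```python
from statistics import mean

# The five types of social media use listed in the graphic.
ACTIVITIES = (
    "upload photos",
    "upload videos",
    "manage profile",
    "blogging",
    "microblogging",
)


def social_media_index(shares):
    """Unweighted average of the five activity percentages for one country."""
    return mean(shares[activity] for activity in ACTIVITIES)


def rank_countries(data):
    """Return (country, index) pairs, most active country first."""
    scored = ((country, social_media_index(shares)) for country, shares in data.items())
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

An unweighted mean treats blogging and uploading photos as equally important, which is a simplification; weighting the activities differently could change the ranking.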


How did Tinkebell obtain all that personal information?

Almost three years ago, artists Tinkebell and Coralie Vogelaar published the book Dearest Tinkebell, in which they revealed the identity, photos, addresses and all sorts of embarrassing personal information about people who had sent hate mail to ‘cat murderer’ Tinkebell. The book is again drawing attention because of an article in the Guardian.

How did Tinkebell go about investigating the people who had made threats against her? “By checking whether the email addresses were registered at other websites as well, she could easily discover the identity of many of the people who had made threats against her”, the Volkskrant wrote. In this way, she got access to ‘Facebook profiles, Amazon wish lists and Youtube accounts’.

Of course, it wasn’t as easy as the Volkskrant suggests. In a supplement to the book, Vogelaar describes five steps to find out the identity of a mailer. Step 1 simply consists in googling the email address. “Often this only resulted in comments on blogs and sometimes a small profile but rarely in a full name.”

Apparently, the interesting information didn’t usually surface until step 2, in which the email addresses were linked to the Rapleaf database (steps 3 to 5 are mainly about verifying the information). When Tinkebell and Vogelaar published their book, nobody had heard of that company. That changed in 2010, when the Wall Street Journal created a bit of a fuss with a series of articles on the trade in personal information, under the title ‘What they know’.

One of the main companies active in this market is Rapleaf, which at the time claimed it had one billion email addresses at its disposal. These addresses are linked with data on your social network activity, your purchases and other information. In this way, the company builds a detailed profile of you. A spokesperson said at the time that Rapleaf never reveals people’s names to clients, but Vogelaar and Tinkebell had already shown that you can easily obtain someone’s identity, and much more, with the data provided by the company.


Catch up on Codeyear

I got an email from Codeyear: “Still want to learn to code? We’ll help you catch up!”. Thank you kindly, but I’m pretty much on schedule. I think these people need your help more...