champagne anarchist | armchair activist

Web scrapping

Search volume for web scrapping and web scraping according to Google Trends (52-week moving average).

Search volume for web scrapping as percentage of volume for both terms, according to Google Trends (52-week moving average).

I came across the following quote on the Web Scraping website:

I searched my email and found over the last few years I received 76 messages from clients containing the text Web Scrapping rather than the usual spelling Web Scraping. And this is not unique to my clients - currently Google has 122,000 results for “Web Scrapping” compared to 447,000 results for “Web Scraping” - the correct spelling returns only 4x the number of results. So in light of this common spelling mistake I registered the domain and redirected it here.

I thought this had to be a joke - but it wasn’t. The domain redirect actually works, and there does appear to be a persistent search volume for web scrapping, even if its share of total web scra[p]{1,2}ing searches has declined considerably.


The moralism and hypocrisy around ad blockers

With my new iPhone, I can finally install ad blockers. When I tried to find information about the available options, I was struck by the moralism and hypocrisy of many articles on the subject. This subtitle says it all: How to use ad-blockers in iOS 9 (and why you shouldn’t).

Sure, the article makes some valid points. One may question Apple’s motives for allowing ad blockers. And certainly, one may question Adblock’s policy to allow «acceptable ads» from companies that pay them a fee (so use an alternative like the open source uBlock Origin instead). But the claim that ad blocking could «kill journalism as we know it» seems a bit over the top.

The advertising industry tries to frame ad blocking as an attack on «the little guy», by which they mean small, independent publishers. Their strategy is similar to the Home taping is killing music campaign of the 1980s, by which the music industry tried to make us believe that home taping was bad for musicians. In reality, home taping was killing the profits of the very industry that was exploiting those musicians in the first place.

Journalists should be paid for their work, but I’m not convinced advertising is the solution. Ads are annoying, they slow down the internet, they waste valuable surface on mobile screens, they often come with scripts that track you and sometimes they spread malware. Perhaps even more importantly: ideally, journalists shouldn’t depend on advertising in the first place, because advertising is killing independent journalism.

So how should journalists get paid? I’m not sure there’s an easy answer. One way is to pay collectively, which may work rather well (BBC), but it does entail some degree of state regulation. Another way is to buy a subscription from each site or publisher that publishes interesting articles - but that’s rather cumbersome.

A practical alternative is a subscription service like Blendle - described as «the Netflix or Spotify for journalism» (although it’s more like iTunes in that you pay per article). Blendle is an interesting initiative, but there’s reason for caution.

If successful, services like Blendle may well develop into large corporations that try to control access to news stories - much like Spotify tries to control access to music (and Facebook tries to control access to news stories). The outcome could be that subscription services become profitable by exploiting journalists. Also, subscription services could amass an unhealthy degree of control over what we read, and could introduce opaque algorithms similar to the ones Facebook uses to decide what content we get to see.

Things might get interesting if journalists drew inspiration from musicians and set up cooperatives. These could take the form of not-for-profit Blendle alternatives that offer independent quality journalism at a fair price, produced by journalists who are paid a fair wage for their work.

For now, ad blockers not only offer practical benefits; they also force the internet to address its unhealthy dependency on advertising.


Twitter discovers Steven Kruijswijk

Tweets mentioning names of riders, as percentage of all tweets with hashtag #giro, per day (smoothing applied). Data updated every hour (to update chart, clear browser history or click here to view the chart). Chart:

If all goes well, Steven Kruijswijk might just be the first Dutch rider in ages to conquer the podium in a large race, journalist Thijs Zonneveld wrote on Friday 20 May. At that point, Twitter hadn’t really discovered Kruijswijk yet. That changed on Saturday, when Kruijswijk won the pink jersey.


Strava wants my commute data

Dutch tv recently aired a fascinating documentary on the «smart city» phenomenon. Companies like Google are teaming up with local governments to further expand their already huge datasets on human behaviour, raising the spectre of total control and absence of privacy (someone used the word panoptical).

Proponents claim the smart city will make cities more efficient and perhaps even sustainable. But judging by the examples given by Amsterdam’s smart city czar (he was quoted in the same documentary), the main beneficiaries may well be motorists. Big data is used to help them navigate their cars through the city and find a place to park. In fact, only one out of Amsterdam’s ~100 smart city projects even mentions the word fiets (bicycle).

And now Strava wants my commute data. They’ve proclaimed tomorrow, the 10th of May, Bike to Work Day. If you upload your ride to work, they promise to make your commutes count:

With data like this, cities can better understand how people choose to interact with the network of roads, bike paths and intersections. The result is improved decision-making, smarter planning, safer streets and more people biking, running and walking. Better data is a catalyst for change.

Bringing a bit more balance to the smart city phenomenon by adding lots of cycling data sounds like a good idea. But will it work? When Dutch cyclists’ organisation Fietsersbond ran a very similar initiative, Bicycle Count Week, a critic argued that some of the worst bicycle infrastructure in Amsterdam can easily be identified without recording any rides. These problems remain unsolved not for lack of data, but for lack of political will.

Personally, I’d argue that data can be useful, if used critically. But I’m not sure the interpretation of data should be left to the smart city alliance of local governments and corporations.

So will I upload my ride to work tomorrow? To be honest, I’ll probably forget to record it in the first place.

The smart city documentary, part of VPRO’s Tegenlicht series, can be seen in Dutch here. The VPRO has translated some of its Tegenlicht (Backlight) documentaries, but I don’t think this one is available in English yet.


Embedding D3.js charts in a responsive website

For a number of reasons, I like to use D3.js for my charts. However, I’ve been struggling for a while to get them to behave properly on my blog, which has a responsive theme. I’ve tried quite a few solutions from Stack Overflow and elsewhere, but none seemed to work.

I want to embed the chart using an iframe. The width of the iframe should adapt to the column width and the height to the width of the iframe, maintaining the aspect ratio of the chart. The chart itself should fill up the iframe. Preferably, when people rotate their phone, the size of the iframe and its contents should update without the need to reload the entire page.

Styling the iframe

Smashing Magazine has described a solution for embedding videos. You enclose the iframe in a div and use CSS to give that div a height of 0 and a bottom padding of, say, 40% (the percentage depends on the aspect ratio you want). You can then position the iframe absolutely inside the div and set both its width and height to 100%. Here’s an adapted version of the code:

.container_chart_1 {
    position: relative;
    padding-bottom: 40%;
    height: 0;
    overflow: hidden;
}

.container_chart_1 iframe {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
}

<div class='container_chart_1'>
<iframe src='' frameborder='0' scrolling='no' id='iframe_chart_1'></iframe>
</div>
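
The padding percentage is simply the chart's height divided by its width. As a small sketch of that arithmetic (the helper name paddingBottomPercent is made up for this illustration and is not part of the Smashing Magazine solution):

```javascript
// Convert a chart's width and height into the padding-bottom percentage
// used by the wrapper div. Hypothetical helper, for illustration only.
function paddingBottomPercent(width, height) {
    if (!(width > 0) || !(height > 0)) {
        throw new RangeError('width and height must be positive numbers');
    }
    // Multiply first so that common ratios come out as round numbers.
    return (height * 100) / width;
}

// A chart designed at 1000 x 400 pixels needs padding-bottom: 40%.
var css = 'padding-bottom: ' + paddingBottomPercent(1000, 400) + '%;';
```

A 16:9 video would give 56.25% by the same calculation.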

Making the chart adapt to the iframe size

The next question is how to make the D3 chart adapt to the dimensions of the iframe. Here’s what I thought might work but didn’t: in the chart, obtain the dimensions of the iframe using window.innerWidth and window.innerHeight (minus 16px - something to do with scrollbars apparently?) and use those to define the size of your chart.

Using innerWidth and innerHeight seemed to work - until I tested it on my iPhone. Upon loading a page it starts out OK, but then the update function increases the size of the chart until only a small detail is visible in the iframe (rotate your phone to replicate this). Apparently, iOS returns not the dimensions of the iframe but something else when innerWidth and innerHeight are used. I didn’t have that problem when I tested on an Android phone.

Adapt to the iframe size: Alternative solution

Here’s an alternative approach for making the D3 chart adapt to the dimensions of the iframe. Set width to the width of the div that the chart is appended to (or to the width of the body) and set height to width * aspect ratio. Here’s the relevant code:

var aspect_ratio = 0.4;
var frame_width = $('#chart_2').width();
var frame_height = aspect_ratio * frame_width;

The disadvantage of this approach is that you’ll have to set the aspect ratio in two places: both in the css for the div containing the iframe and in the html-page that is loaded in the iframe. So if you decide to change the aspect ratio, you’ll have to change it in both places. Other than that, it appears to work.
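
One way to soften this drawback might be to pass the aspect ratio to the chart page in the iframe URL, so that the embedding page decides both values. This is only a sketch and not what I actually use: the query parameter name aspect and both function names are assumptions, and it relies on URLSearchParams, which older browsers lack.

```javascript
// Read the aspect ratio from a query string such as "?aspect=0.4".
// The parameter name "aspect" is an assumption made for this sketch.
function aspectRatioFromQuery(search, fallback) {
    var value = parseFloat(new URLSearchParams(search).get('aspect'));
    // Fall back when the parameter is missing, not a number, or not positive.
    return isFinite(value) && value > 0 ? value : fallback;
}

// Derive the chart size from the container width and the aspect ratio.
function chartSize(frameWidth, aspectRatio) {
    return { width: frameWidth, height: aspectRatio * frameWidth };
}

// In the page loaded in the iframe (browser only, assumes jQuery):
// var ratio = aspectRatioFromQuery(window.location.search, 0.4);
// var size = chartSize($('#chart_2').width(), ratio);
```

The ratio would still appear in the css of the containing div, but the chart page itself would no longer need editing when it changes.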

Reloading the chart upon window resize

Then write a function that reloads the iframe content upon window resize, so as to adapt the size of the chart when people rotate their phone. Note that on mobile devices, scrolling may trigger the window resize. You don’t want to reload the contents of the iframe each time someone scrolls the page. To prevent this, you may add a check whether the window width has changed (a trick I picked up here). Also note that with Drupal, you need to use jQuery instead of $.

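var width = jQuery(window).width();
jQuery(window).resize(function () {
    // Scrolling on mobile can fire resize, so reload only if the width changed.
    if (jQuery(window).width() != width) {
        document.getElementById('iframe_chart_1').src = document.getElementById('iframe_chart_1').src;
        width = jQuery(window).width();
    }
});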
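
The width check can also be pulled out into a small function of its own, which makes it possible to try it without a browser. A minimal sketch, assuming a made-up helper name makeWidthChangeHandler (it is not a jQuery or D3 function):

```javascript
// Returns a resize handler that calls onWidthChange only when the width
// reported by getWidth() differs from the last width it saw.
// Hypothetical helper, for illustration only.
function makeWidthChangeHandler(getWidth, onWidthChange) {
    var lastWidth = getWidth();
    return function () {
        var currentWidth = getWidth();
        if (currentWidth !== lastWidth) {
            lastWidth = currentWidth;
            onWidthChange(currentWidth);
        }
    };
}

// Browser usage (assumes jQuery is loaded and the iframe id used above):
// jQuery(window).resize(makeWidthChangeHandler(
//     function () { return jQuery(window).width(); },
//     function () {
//         var iframe = document.getElementById('iframe_chart_1');
//         iframe.src = iframe.src;
//     }
// ));
```

Resize events that leave the width unchanged, such as the ones triggered by scrolling on a phone, fall through without reloading the iframe.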

In case you know a better way - do let me know!

FYI, here’s the chart used as illustration in its original context.