champagne anarchist | armchair activist

Quitting Facebook

Last month, data scientist Vicki Boykis posted an interesting article about the kind of data Facebook collects about you. It’s one of those articles that make you think: I really should delete my Facebook account - and then you don’t.

One could argue that Google search data illustrates how people relate to Facebook. People know Facebook isn’t good for them, but they can’t bring themselves to quit. However, when it’s time for New Year’s resolutions, they start googling how to delete their account.

UPDATE - Vicki Boykis just suggested to label major news events. In the past Google Trends had a feature that did just that, but I think they killed it. Of course, you can still do Google or Google News searches for a particular period. As a start I added two stories that may have contributed to the mid–2014 peak. Let’s see if other people come up with more.


Note that the Google search data is per week so each data point really refers to the week starting at that date.

I wanted to do a chart like this in December last year, which would perhaps have been a more appropriate moment. However, I didn’t get consistent data out of Google Trends using search terms like quit facebook. The other day, after deleting my own Facebook account, I realised I had probably used the wrong search term. People don’t search for quit facebook but more likely for delete facebook - they’re looking for technical advice on how to delete their account.


New Python package for downloading and analysing street networks


The image above shows square mile diagrams of cyclable routes in the area around the Stationsplein in Amsterdam, the Hague, Rotterdam and Utrecht. I made the maps with OSMnx, a Python package created by Geoff Boeing, a PhD candidate in urban planning at UC Berkeley (via).

Square mile diagrams are a nice gimmick (with practical uses), but they’re just the tip of the iceberg of what OSMnx can do. You can use it to download administrative boundaries (e.g. the outline of Amsterdam) as well as street networks from Open Street Map. And you can analyse these networks, for example: assess their density, find out which streets are connections between separate clusters in the network, or show which parts of the city have long or short blocks (I haven’t tried doing network measure calculations yet).

Boeing boasts that his package not only offers functionality that wasn’t (easily) available yet, but also that many tasks can be performed with a single line of code. From what I’ve seen so far, it’s true: the package is amazingly easy to use. All in all, I think this is a great tool.

Amsterdam’s most irritating traffic light is at the Middenweg

Red and orange dots show locations of irritating traffic lights. If any comments have been submitted, the dot is red. Click on a red dot, or type a few letters below, to see comments about a particular crossing (comments are mostly in Dutch).

Amsterdam’s most irritating traffic light is at the crossing of Middenweg and Wembleylaan, according to a poll among cyclists. The Amsterdam branch of cyclists’ organisation Fietsersbond says the top 10 most irritating traffic lights are well-known problem sites.

Comments made by participants in the poll show that cyclists are not just annoyed about long delays; they are also concerned about safety, especially at locations where many (school) children cross the street. Some cyclists nevertheless keep their spirits up: Plenty of time for an espresso there!!

Here are the ten most irritating traffic lights:

  1. Middenweg / Wembleylaan
  2. Amstelveenseweg / Zeilstraat
  3. Middenweg / Veeteeltstraat
  4. Rozengracht / Marnixstraat
  5. Meer en Vaart / Cornelis Lelylaan Nz
  6. IJburglaan / Zuiderzeeweg
  7. mr Treublaan / Weesperzijde
  8. Frederiksplein / Westeinde
  9. Nassauplein / Haarlemmerweg
  10. Van Eesterenlaan / Fred Petterbaan

Some are at routes where the city gives priority to car circulation, at the expense of cyclists and pedestrians. However, cyclists say they frequently have to wait at red lights even though the crossing is empty. This could be a result of budget cuts on maintenance of the systems that detect waiting cyclists.

Quite a few cyclists complained about cars running red lights (perilous!) or blocking the crossing. Further, not everybody is happy with crossings where all cyclists simultaneously get a green light. Such a set-up is nice if you have to make a left turn, for it will spare you having to wait twice, but it may result in chaos.

The Fietsersbond wants traffic lights adjusted to create shorter waiting times for cyslists and pedestrians. Research by DTV consultants found that adjusting traffic lights is a simple and cheap way to improve the circulation of cyclists, and that it also improves safety.

An analysis of location data from cyclists’ smart phones found that there are traffic lights in Amsterdam where the average time lost exceeds 30 seconds.

Thank you to the Fietsersbond and to Eric Plankeel for their input; and to all cyclists who participated in the poll.


How much delay for cyclists is caused by traffic lights

Road segments near traffic lights

The other day I posted an article on how much time cyclists lose at traffic lights in Amsterdam. Someone asked if I can calculate what percentage of total time lost by cyclists is caused by traffic lights. Keep in mind that delays can be caused by traffic lights, but also by crossings without traffic lights, crowded routes and road surface.

Here’s an attempt to answer the question, although I must say it’s a bit tricky. Again, I’m using data from the Fietstelweek (Bicycle Counting Week), during which over 40,000 cyclists shared their location data. This time I’m using the data about links (road segments). For each link, they provide the number of observations, average speed and relative speed.

With this data, it should be possible to estimate what share of total delays occurs near traffic lights. But what is near? It’s to be expected that the effect of traffic lights is observable at some distance: people slow down while approaching a traffic light and it takes a while to pick up speed again after. But what threshold should you use to decide which segments are near a traffic light?

One way to address this is to look at the data. I created a large number of subsets of road segments that are within increasing distances from traffic lights, and calculated their average speed. For example, segments that are within 50m from a traffic light have an average speed of about 16 km/h. The larger group of segments that are within 150m have an average speed of about 17 km/h.

Judging by the chart, it appears that the effect of traffic lights is diminishing beyond, let’s say, 150m. You could use this as a threshold and then calculate that delays near traffic lights constitute nearly 60% of all delays.

However, there’s a problem. Even if a delay occurs within 150m of a traffic light, the traffic light will not always be the cause of that delay. I tried to deal with this by estimating a net delay, which takes into account how much delay normally occurs when cyclists are not near a traffic light (in fact, I used two methods, that have quite similar outcomes). Using this method, it would appear that over 20% of delays are caused by traffic lights.

Now, I wouldn’t want to make any bold claims based on this: these are estimates based on assumptions and simplifications (in fact, if you think there’s a better way to do this I’d be interested). That said, I think it’s fair to say that average bicycle speeds appear to be considerably lower near traffic lights and that it’s plausible that this may be the cause of a substantial share of delays for cyclists.

UPDATE - I realise that the way I wrote this down sort of implies that you could reduce delay for cyclists by perhaps 20% just by removing traffic lights, but that would of course be a simplification.


I used Qgis to process the Fietstelweek data. I used the clip tool to select only road segments in Amsterdam. I had Qgis calculate the distance of each segment and extract the nodes, which I needed to get the coordinates of the start and end points. Further processing was done with Python.

The dataset contains a relative speed variable (it is capped at 1, which means that it only reflects people cycling slower than normal, not faster). A relative speed of 0.8 would mean that people cycle at 80% of their normal speed. I calculated total delay at segments this way:

number of observations * (1 - relative speed) * distance / speed

You can then calculate delay at segments near traffic lights, as a percentage of the sum of all delays.

I tried to get an idea of how much of delay is actually caused by traffic lights, by estimating net delay. For this, I needed net relative speed. I used two methods to estimate this: 1. divide the relative speed of a segment by the median relative speed of all segments that are not near a traffic light; and 2. divide the speed of a segment by the median speed of all segments that are not near a traffic light.

Python code here.

DuckDuckGo shows code examples

Because of Google’s new privacy warning, I finally changed my default search engine to DuckDuckGo.[1] So far, I’m quite happy with it. I was especially pleased when I noticed they sometimes show code snippets or excerpts from documentation on the results page.

Apparently, DDG has decided that it wants to be «the best search engine for programmers». One feature they’re using are the instant answers that are sometimes shown in addition to the ‘normal’ search results. These instant answers may get their contents from DDGs own databases - examples include cheat sheets created for the purpose - or they may use external APIs, such as the Stack Overflow API. Currently, volunteers are working to improve search results for the top 15 programming languages, including Javascript, Python and R.

One could argue that instant answers promote the wrong kind of laziness - copying code from the search results page rather than visit the original post on Stack Overflow. But for quickly looking up trivial stuff, I think this is perfect.

  1. I assume the contents of the privacy warning could have been reason to switch search engines, but what triggered me was the intrusive warning that Google shows in each new browsers session - basically punishing you for having your browser throw away cookies.  ↩