Data

Translating polls into policy outcomes

In a report on last May’s Australian election, Nick Evershed of the Guardian translated live election results into support for specific policy outcomes.

We wanted to make an alternate view of election results that moves the results away from the ‘horse race’ and instead emphasises the policy outcomes of the election – that is, what the outcome will actually mean for people in the real world.

I reckoned you could do the same with polls instead of election results. I selected a number of proposals that have been put to a vote in the Dutch Lower House. Using Tom Louwerse’s Peilingwijzer ‘poll of polls’, I tracked developments in the combined virtual vote share of the parties that have voted in favour of those proposals.
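
The calculation itself is simple. Here is a minimal Python sketch of it, using invented party names, vote shares and proposals rather than the actual Peilingwijzer data or voting records (and not the code behind the chart): for each proposal, add up the polled vote share of the parties that voted in favour, and repeat this for every poll date.

```python
# Hypothetical poll-of-polls vote shares (percent) per party at two dates.
# All numbers and party names are invented for illustration only.
polls = {
    "2019-01": {"A": 20.0, "B": 15.0, "C": 10.0, "D": 5.0},
    "2019-06": {"A": 17.0, "B": 18.0, "C": 9.0, "D": 6.0},
}

# Hypothetical proposals mapped to the parties that voted in favour.
supporters = {
    "proposal_1": {"A", "B"},
    "proposal_2": {"B", "C", "D"},
}


def combined_share(poll, parties):
    """Sum the polled vote share of every party that backed a proposal."""
    return sum(share for party, share in poll.items() if party in parties)


def track_support(polls, supporters):
    """Return {proposal: {date: combined share}} for all proposals and dates."""
    return {
        proposal: {date: combined_share(poll, parties) for date, poll in polls.items()}
        for proposal, parties in supporters.items()
    }


result = track_support(polls, supporters)
for proposal, series in result.items():
    print(proposal, series)
```

In this made-up example, party A loses three points and party B gains three, so the combined support for proposal_1 (backed by both) stays at 35 percent, while proposal_2 (backed by B but not A) rises from 30 to 33 percent.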

The support for individual parties may show considerable fluctuations, but the combined support for policy proposals is relatively stable. This shouldn’t really come as a surprise. Voters may switch quite easily from one party to the other, but not randomly: they tend to stick to a set of parties with broadly similar values. This suggests that voters will often switch between parties that tend to support the same proposals.

Still, some proposals do show growth or decline in their combined virtual vote share. This includes proposals that were supported by either FvD or PVV but not both: FvD has seen considerable growth in the polls, partly at the expense of PVV. Proposals supported by left-wing parties also saw their support grow somewhat, but not if D66 was among the supporting parties.

So what does this all mean? The chart above doesn’t predict which policies will be implemented after the next election (just like the underlying polls aren’t simply a prediction of the next election result). However, it does appear to be a useful tool to make sense of fluctuations in polls.

Key to parties.

UPDATE 21 July 2019 - New Peilingwijzer data has since been published and the chart has been updated to include it. This doesn’t materially change the conclusions.

Caveats

One could argue that how parties vote doesn’t always reflect their position, especially when coalition parties have to stick to concessions they have made in the coalition agreement. I dealt with this by using only proposals (with one exception) on which the coalition parties VVD, CDA, D66 and ChristenUnie did not vote unanimously. Apparently, there was no pressure to vote along coalition lines in these cases.

Voters (and respondents in polls) aren’t always aware of the positions of the parties they support. For example, many voters want the government to reduce income differences. They may (wrongly) assume that the party they support also wants to reduce income differences.

As for the chart: an area chart is always a bit problematic, but it would appear to be a defensible choice when you want to show developments in the combined vote share of a number of parties. I guess it could be improved by putting parties that show large variations in the polls on top and the more stable ones at the bottom.

Traffic flow maps

Traffic Diagram of London, by Ludwig Karl Hilberseimer, 1944. Ludwig Karl Hilberseimer Archive, Ryerson and Burnham Archives, The Art Institute of Chicago.

A tweet by artist, designer and developer Jill Hubley drew my attention to a traffic flow map of London, created by Bauhaus architect Ludwig Karl Hilberseimer. The map shows the number of buses passing through London’s central arteries in one hour. «The traffic diagram of London shows both the typical congestion in the center and the lack of transportation facilities at outlying points,» he commented. Hilberseimer thought the solution to this transportation problem was to decentralise the city, by creating satellite cities with a population of at most 100,000.

Hubley has been tweeting numerous historical traffic flow maps, including a beautiful 1944 map showing transport along the waterways of Belgium and the Netherlands. What struck me about the Hilberseimer map is its similarity to a series of maps in a traffic study published by the city of Amsterdam in 1976. These maps were discovered by Marjolein de Lange of cyclists’ organisation Fietsersbond, and have been reproduced in the book Bike City Amsterdam she wrote with Fred Feddes.

From: Voorontwerp verkeerscirculatieplan Amsterdam, 1976. Photo Marjolein de Lange

The Amsterdam maps illustrate how cycling had declined in Amsterdam between 1961 and 1971, and how rising car use had created a congestion problem. It wasn’t until later that the city developed measures to promote cycling, as analysed in Bike City Amsterdam. I tried to create a 2016 version of the cycling map using Fietstelweek data, but it should be noted that the cycling routes of Fietstelweek participants may not be representative of overall bicycle traffic in Amsterdam.

Compared to Hilberseimer’s map, the maps created by the city of Amsterdam have a very clean design: all cartographic details that do not represent traffic data have been omitted. And then there’s the elegant legend. Hubley has tweeted a Swedish traffic flow map from 1977 with a similar type of legend, as well as a map of Florida from 1952, a map of St. Paul - Minneapolis from 1949, and a 1945 map of eastern Germany with a horizontal version of the legend. (Update - interesting variant on this 1963 Lincolnton map; plus see this 1921 map of Seattle.) I wonder whether earlier examples exist.

Were the flow maps in Amsterdam’s traffic circulation plan inspired by Hilberseimer’s Traffic Diagram of London? Possibly, but Hilberseimer wasn’t the first to create a traffic flow map. In fact, both Amsterdam’s map makers and Hilberseimer are indebted to a map created a century before Hilberseimer’s map, by the French civil engineer Charles-Joseph Minard.

Carte de la circulation des voyageurs par voitures publiques sur les routes de la contrée où sera placé le chemin de fer de Dijon à Mulhouse (source), by Charles-Joseph Minard, 1845. Reproduced with the kind permission of the Ecole nationale des ponts et chaussées.

In her book The Minard System, visualisation strategist Sandra Rendgen comments:

In this revolutionary map, created in the middle of a debate about where to project the railroads between Dijon and Mulhouse in eastern France, Minard analyzed the street traffic on preexisting roads in the region.

Apparently, the map was so influential in shaping the debate that a fake copy was made ‘in an attempt to prove another route to be more promising’.

Rendgen describes how Minard initially created bar charts to represent traffic along segments of a route. At some point, he decided to project these graphs onto a map, which resulted in the creation of the flow map. Over time, Minard’s flow maps gained in complexity, as he used colour to represent different types of data. Minard is sometimes credited with inventing the flow map, but Rendgen points out that the design was possibly invented more or less simultaneously in Ireland, France and Belgium.

Minard’s charts and maps often contain detailed descriptions of the data and methods he used. He collected data from a range of sources, and emphasised that graphs should accurately represent the data. On the other hand, he was willing to sacrifice geographic detail or accuracy for clarity. Rendgen points to the ‘clean and minimalist aesthetics’ of his work, devoid of decorations or other clutter. It is no wonder that Edward Tufte, the renowned proponent of clutter-free data visualisation, described Minard’s work as an example of ‘graphical excellence’ (in The Visual Display of Quantitative Information).

A recurrent theme in Minard’s explanatory notes is that he aimed to make relationships quickly apparent to the eye. One of these notes has an almost futurist sense of modernity to it: «The figurative maps are thoroughly in the spirit of the century in which one seeks to save time in all ways possible.»

One could argue that the 1976 Amsterdam traffic flow maps are true heirs to Minard’s approach, and especially to his first, monochrome flow map reproduced above. As Rendgen notes, Minard’s map is «extremely stripped down; it features barely any landscape details other than a network of local place names and rivers». Even those subtle geographical hints have been omitted from the Amsterdam traffic flow maps. Of course, this only works because of the very recognisable pattern of Amsterdam’s streets.

Scraping Airbnb

Airbnb is not exactly keen to share data that might help analyse its impact on local housing markets. In 2016, the Amsterdam Municipality decided to collect Airbnb data using a scraper - a computer programme that automates the job of retrieving information from web pages.

Amsterdam is not the only government to use web scraping. Increasingly, this technique is used to obtain data about topics ranging from consumer prices to job vacancy statistics and business data. Collecting data from the internet has advantages, but it also poses some challenges. It may be difficult to aggregate data coming from different websites, and data found online may not cover all aspects of a phenomenon you’re trying to understand (for example, not all job vacancies are published online). On a more practical level, your web scraper code may break when websites change.

In March 2017, Amsterdam reported that its weekly scrapes of major platforms like Airbnb required little maintenance. But last week, it sent a report to the city council describing how Airbnb has been making changes to its website - perhaps in an attempt to frustrate efforts to collect information about its business practices. Initially, Amsterdam’s digital surveillance department successfully updated its scraper, but following new changes to the Airbnb website since May 2018, Amsterdam now appears to have given up on scraping Airbnb.

This made me curious about the technical characteristics of the Airbnb website. Here are some observations, based on an (admittedly superficial) examination:

  • The initial download of a web page isn’t the final version: after downloading, the contents of the page are dynamically altered using Javascript. For some purposes like navigating search results, you may prefer the final version of the page, which you can get using Selenium. Selenium would especially come in handy for interacting with the calendar to get availability and price information, which seems to be rather tricky.
  • Some details on listings only appear to be available in the Javascript code. You can find them using patterns like '\"lat\"\:(.*?),\"lng\"\:(.*?),'
  • Airbnb uses NGINX to control access to its website. If you request too many pages too fast, you’ll hit a rate limit and get an error page. I guess it should be possible to avoid the rate limit by adding pauses to your programme, but it may take quite some time to figure out how often and how long they should be.
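
The last two observations can be illustrated with a minimal Python sketch. It makes several assumptions: the sample page fragment is invented, the function names are hypothetical, and the pattern is the one quoted above as it would behave when written as an ordinary Python string (so that it matches plain double quotes). It doesn’t request any pages, and it says nothing about how long the pauses would need to be in practice.

```python
import re
import time

# The pattern quoted in the list above, as a raw string that matches plain
# double quotes. Assumption: this is how the quoted pattern is meant to be read.
COORD_PATTERN = re.compile(r'"lat":(.*?),"lng":(.*?),')

# Invented fragment standing in for Javascript embedded in a listing page.
SAMPLE_SOURCE = 'var data = {"listing":{"lat":52.3702,"lng":4.8952,"name":"Example"}};'


def extract_coordinates(source):
    """Return a list of (lat, lng) float pairs found in the page source."""
    return [(float(lat), float(lng)) for lat, lng in COORD_PATTERN.findall(source)]


def paced(items, pause_seconds, sleep=time.sleep):
    """Yield items one by one, pausing between consecutive items.

    The sleep function is a parameter so the pause can be replaced in tests.
    """
    for index, item in enumerate(items):
        if index > 0:
            sleep(pause_seconds)
        yield item


coords = extract_coordinates(SAMPLE_SOURCE)
print(coords)
```

A loop such as `for url in paced(urls, 5.0): ...` would then wait five seconds between requests; both `urls` and the five-second value are placeholders, since the appropriate pause would have to be found by trial and error.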

While it appears that barriers to scraping the Airbnb website may be surmountable, it’s quite possible that I underestimate what this would take. If you actually built a scraper and used it to frequently collect information about all local listings, all kinds of new problems might arise.

Meanwhile, other sources of Airbnb data are available. In a previous post, I used data made available by Tom Slee and by Murray Cox’s Inside Airbnb. Slee has since stopped updating his data, but Inside Airbnb is still active. As the Amsterdam Municipality notes in its report, Inside Airbnb has successfully adapted its scraping technique each time Airbnb changed its website.

UPDATE 13 May - See comments on Twitter: Jens von Bergmann from Vancouver also has a scraper that is working. Following some requests, Tom Slee recently updated his scraper; his code is available on Github.

Amsterdam’s cyclists visualised

Fietsstad Amsterdam, a new book by Fred Feddes and Marjolein de Lange, describes how Amsterdam developed a cycling policy (more about the book below). The archive of Fietsersbond Amsterdam was an important source for the book. Traffic data was also used to analyse trends.

An interesting dataset consists of counts of the number of cyclists, cars and other road users entering and leaving the city, covering the period 1980–2009. Most of the locations where traffic was counted are on the Singelgracht, which forms a kind of circle around the centre of Amsterdam.

Each figure is based on a manual count on a single day, from 7:00 to 19:00, of traffic in both directions.

I was asked to help think of a way to visualise this data, an interesting (and very enjoyable) job. Below, I discuss some of the options we considered.

Radar chart

Because of the location of the counting points, trying a circular chart was the obvious thing to do. The municipal Dienst Infrastructuur Verkeer en Vervoer had already had that idea in 2007. In a fact sheet, they used a radar chart to show the bicycle counts.

Incidentally, they didn’t call their chart a radar chart (spindiagram), but a fan (waaier). They used a bicycle metaphor to explain how the chart works: «from the centre, the counting points around the inner city are connected like spokes in a bicycle wheel».

It’s a nice chart, but this chart type also has a drawback. It implicitly suggests that the area within the purple line corresponds to the number of crossings, which is actually misleading (see this article for a discussion of a similar problem). Another limitation is that the chart doesn’t show how bicycle use has changed, although you could make a version with separate lines for 1980 and 2009.

Radial lollipop chart

As an alternative, I made a radial lollipop chart. At least, that’s what I’m calling it; as far as I know, this chart type didn’t exist yet. The chart library I use, D3.js, doesn’t seem to have a method for drawing the ‘spokes’, or at any rate I couldn’t find one. I therefore wrote a function to calculate the start and end points of the lines. I had long forgotten how to use sine and cosine, so I had to look that up. I’ve published the code here.

Below is a radial lollipop chart showing how bicycle traffic has increased at almost all Singelgracht crossings.

And here is one showing the opposite effect for cars.

I do like it when data points fall outside the chart area, although this is perhaps a bit over the top. The outliers are caused by the fact that a large share of car traffic uses the Wibautstraat - IJtunnel route. I could have adjusted the scale so that these outliers would fall within the chart area, but then it would become much harder to make out changes on other routes and in the bicycle chart.

Area chart

I’m quite taken with the radial lollipop chart, but it has a limitation: it shows the changes between 1980 and 2009, but not when those changes occurred. Car traffic was already declining before the growth in bicycle traffic really got going, but you can’t see that in the radial lollipop chart.

That is why the book has an area chart, in which colours correspond to the geographical orientation of the crossings. Simple, but effective. And if you want to dive into the details, click here for an earlier sketch: bicycle, car.

About the book and the exhibition

Fietsersbond Amsterdam has handed over its archive to the Stadsarchief. Marjolein de Lange, who coordinated a volunteer project to prepare the handover, came up with the idea of using the material as input for a book. She then carried out that idea together with author Fred Feddes.

The result is a very interesting book about activism versus cooperation, about the place of the bicycle in municipal policy, and about how the magic power of Amsterdam’s cycling culture tipped the balance in the epic battle over the cyclists’ passage under the Rijksmuseum. The book is also full of fantastic photos, maps and posters. A must for anyone interested in cycling, Amsterdam, or campaign posters. There is also a free exhibition at the Stadsarchief (until 30 June).

Visualising Amsterdam’s cyclists

Bike City Amsterdam, a new book by Fred Feddes and Marjolein de Lange, recounts how Amsterdam developed a cycling policy (more on the book below). An important source for the book is the archive of the Amsterdam branch of cyclists’ organisation Fietsersbond. In addition, traffic data was used to analyse trends.

An interesting dataset consists of counts of the number of cyclists, cars and other road users moving into and out of Amsterdam’s city centre, over the years 1980–2009. Most of the locations where traffic was counted are on the Singelgracht, which encircles Amsterdam’s city centre.

The data represents manual counts on a single day, between 7am and 7pm, of traffic in both directions.

I was asked to think about a way to visualise this dataset, which posed an interesting challenge (and was a lot of fun to do). Below, I’ll discuss a few of the options we considered.

Radar chart

Given the geographical distribution of counting locations, it seemed to make sense to try a circular chart design. In fact, that idea had also occurred to the city’s infrastructure department. In a 2007 fact sheet, they used a radar chart (or cobweb chart) to visualise the Singelgracht bicycle counts.

Incidentally, they didn’t use the term radar chart, but called it a fan (waaier). They used a bicycle metaphor to describe how it works: «from the middle, the counting locations around the city centre are connected like spokes in a bicycle wheel».

The chart looks really nice, but this chart type also has a drawback: there’s an implicit suggestion that the area within the purple line represents the number of crossings, which is in fact misleading (see this article for a discussion of a similar problem). Another limitation is that the chart doesn’t show how bicycle traffic changed - although it would be possible to make a version with separate lines representing 1980 and 2009.

Radial lollipop chart

As an alternative, I created what I’ll call a radial lollipop chart (to my knowledge, this chart type didn’t exist yet). The chart library that I use, D3.js, doesn’t seem to have a method to draw the ‘spokes’, or at least I couldn’t find it. Therefore, I wrote a function that calculates the start and end points of the lines. I had long forgotten how to use sine and cosine, so I had to look that up. I’ve published the code here.
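
For readers who have also forgotten their trigonometry, here is a minimal sketch of how such spoke coordinates can be calculated. It is written in Python rather than D3.js and is not the published code; the function name, its parameters and the choice to measure angles clockwise from 12 o’clock are all assumptions made for illustration.

```python
import math


def spoke_endpoints(index, count, r_start, r_end, cx=0.0, cy=0.0):
    """Return ((x1, y1), (x2, y2)) for spoke number `index` out of `count`.

    Angles run clockwise from 12 o'clock, in screen coordinates where y
    grows downward. r_start and r_end are distances from the centre
    (cx, cy), for instance the scaled counts for the first and last year.
    """
    angle = 2 * math.pi * index / count
    # Unit vector for this angle: sine gives the horizontal component,
    # minus cosine the vertical one (because y points down on screen).
    dx, dy = math.sin(angle), -math.cos(angle)
    start = (cx + r_start * dx, cy + r_start * dy)
    end = (cx + r_end * dx, cy + r_end * dy)
    return start, end


# Four evenly spaced spokes, each running from radius 10 to radius 20.
for i in range(4):
    print(i, spoke_endpoints(i, 4, 10, 20))
```

Each spoke is then a line between the two points, with the ‘lollipop’ dot drawn at one end.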

Here’s a radial lollipop chart showing how cycling has increased at virtually all the Singelgracht crossings.

And here’s one showing the opposite effect for cars:

I love it when a chart has data points that break out of the chart area - although this is perhaps a bit extreme. The outliers are due to the fact that a large share of car traffic uses the Wibautstraat - IJtunnel route. I could have changed the scale to include those outliers, but then changes on other routes as well as changes in bicycle use would have become much more difficult to discern.

Area chart

I rather like the radial lollipop chart, but it has a limitation: it shows changes between 1980 and 2009, but not when those changes happened. Car use started to go down before cycling really started to increase, but you can’t tell that from the radial lollipop chart.

This is why the chart used in the book is an area chart, with colours corresponding to the broad geographical orientation of the crossings. Simple, but effective. And if you want to explore the details, click here for a draft version of the charts: bicycle, car.

About the book and exhibition

On 4 April, the Amsterdam branch of cyclists’ organisation Fietsersbond handed over its archive to the Municipal Archive. Marjolein de Lange, who coordinated a volunteer project to prepare the archive, came up with the idea to use the material as input for a book - a project she carried out with author Fred Feddes.

This resulted in a very interesting book about activism versus cooperation; the place of cycling in urban planning; and how the magic power of Amsterdam’s cycling culture decided the epic fight for the right to cycle through the passage under the Rijksmuseum. The book, which contains a wealth of great photos, maps and posters, is a must-read for anyone interested in cycling, Amsterdam, or activist poster design. It’s been published both in Dutch and in English. There’s also an exhibition at the Municipal Archive (until 30 June, Vijzelstraat 32, access is free).
