champagne anarchist | armchair activist

Data

Has Google Maps found a way to have its cake and eat it?

Not interesting: P.L. Takbuurt

Google Maps are for transportation; Apple Maps are more of an advertising channel, I tweeted a while ago. That was based on a fascinating analysis by Justin O’Beirne, who found, among other things, that Google Maps show far more rail and underground stations, while Apple Maps show far more restaurants and shops.

However, things may have changed somewhat. The CityMetric blog of the New Statesman reports that Google has been adding orangey areas to its maps. As Google explains, they represent areas of interest:

Whether you’re looking for a hotel in a hot spot or just trying to determine which way to go after exiting the subway in a new place, areas of interest will help you find what you’re looking for with just a couple swipes and a zoom.

We determine areas of interest with an algorithmic process that allows us to highlight the areas with the highest concentration of restaurants, bars and shops. In high-density areas like NYC, we use a human touch to make sure we’re showing the most active areas.

Assuming they haven’t sacrificed any stations, this suggests they have found a way to have their cake and eat it: remain useful for transportation purposes while adding marketing opportunities.

However, CityMetric writer John Elledge is not impressed by Google’s algorithm to identify areas of interest. He argues that «an algorithm that thinks Trafalgar Square is less an area of interest than the restaurants across the road is not fit for purpose».

As for Amsterdam, Google’s algorithm seems to be relatively good at identifying lively neighbourhoods, although they may have missed a few. On the other hand, the Museumplein, where the Rijksmuseum, Van Gogh Museum and Stedelijk are, isn’t marked as interesting, but then I’m sure tourists don’t need Google Maps to tell them to go there. Some of the most spectacular examples of Amsterdam School architecture (around P.L. Takstraat, Zaanhof) are similarly overlooked. By contrast, rather dull shopping centres such as Oostpoort are marked as interesting.

All in all, the correct designation for Google’s orangey areas would perhaps be commercial areas rather than areas of interest.

Users versus programmers: lon,lat or lat,lon

Somebody at Mapbox wrote a blog post in which he makes the case that longitude should go first: almost all data formats (including Google’s KML) and all open source software (except Leaflet) use this order. Also, it’s the logical order if you include altitude (XYZ), he argues.

Of course, it can’t be that simple, as this debate on Stack Overflow illustrates. It seems that programmers prefer lon,lat while people who use maps - seafarers, Google Maps users - expect lat,lon. As one commenter puts it:

Good rule of thumb: if you know what a tuple is and are programming, you should be using lon,lat. I would even say this applies if your end user (say a pilot or a ship captain) will prefer to view the output in lat,lon. You can switch the order in your UI if necessary, but the overwhelming majority of your data (shapefiles, geojson, etc.) will be in the normal Cartesian order.

Another good rule of thumb: always check.
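What «always check» could look like in practice: a minimal sketch, assuming you know roughly where your data should be. A plain range check (latitude between -90 and 90) won't catch a swap for a place like Amsterdam, where both numbers are valid either way, so this compares the pair against an expected bounding box instead. The names `NL_BBOX`, `in_bbox`, `guess_order` and `swap` are made up for illustration, and the bounding box numbers are rough.

```python
from typing import Tuple

# Rough bounding box around the Netherlands: (min_lon, min_lat, max_lon, max_lat).
# The exact numbers are an assumption, chosen only for illustration.
NL_BBOX = (3.0, 50.5, 7.5, 54.0)


def in_bbox(lon: float, lat: float, bbox: Tuple[float, float, float, float]) -> bool:
    """Return True if the point (lon, lat) falls inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def guess_order(pair: Tuple[float, float], bbox: Tuple[float, float, float, float]) -> str:
    """Guess whether a coordinate pair is lon,lat or lat,lon.

    Tries both readings against the expected bounding box and returns
    "unknown" if neither reading, or both, would fit.
    """
    a, b = pair
    as_lon_lat = in_bbox(a, b, bbox)
    as_lat_lon = in_bbox(b, a, bbox)
    if as_lon_lat and not as_lat_lon:
        return "lon,lat"
    if as_lat_lon and not as_lon_lat:
        return "lat,lon"
    return "unknown"


def swap(pair: Tuple[float, float]) -> Tuple[float, float]:
    """Switch between lon,lat and lat,lon, e.g. just before display in a UI."""
    a, b = pair
    return (b, a)


# Amsterdam is at roughly 4.90 E, 52.37 N.
amsterdam = (4.90, 52.37)
print(guess_order(amsterdam, NL_BBOX))
print(guess_order(swap(amsterdam), NL_BBOX))
```

This follows the commenter's advice: keep the data in lon,lat and only switch the order at the edges, for instance when showing coordinates to a user who expects lat,lon.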

Is it still ok to ridicule pie charts?

Workers without job security as a percentage of all working people in the Netherlands. The pink slice shows the percentage in 2003; the red slice how much this has increased since. Data Statistics Netherlands, chart dirkmjk.nl.

In a series of articles that caused a bit of a commotion among chart geeks, Robert Kosara summarised the findings of a number of studies on pie charts. In one of the articles, he observes:

Pie charts are generally looked down on in visualization, and many people pride themselves on saying mean things about them and the people who use them.

I guess I’m one of those people who look down on pie charts. Sure, I’m not as outspoken as the respected Edward Tufte, who famously wrote that «the only worse design than a pie chart is several of them». I’m not always against pie charts and I’ve even experimented with animated pie charts to illustrate change in a proportion. But I’m not above making lame jokes about pie charts either. My rule of thumb would be: don’t use pie charts - unless you can come up with a good reason why you should use one in a particular situation.

Kosara describes a number of studies in which he measured how accurately people interpret pie charts and other charts showing a proportion, e.g. 27%. According to his findings, exploded pie charts are doing worse than regular pie charts (phew!) and square pie charts are doing better. Interestingly, a stacked bar chart appears to be doing worse than a regular pie chart (note that a stacked bar chart depicting a single proportion amounts to something that looks like a progress bar).

It’ll be interesting to see how this holds up in future studies. But for now, the finding that (stacked) bar charts are doing worse than pie charts may come as a bit of a shock, for there appears to be a sort of consensus that bar charts are generally better than pie charts. Question is, better at what?

Workers without job security as a percentage of all working people in the Netherlands. Data Statistics Netherlands, chart dirkmjk.nl.

A bar chart is quite good at showing that the level of workers without job security in the Netherlands was higher in 2015 than in 2014. But which chart type is better at showing how much the share has increased between 2003 and 2015? Until recently I would have said «the bar chart» without hesitation, but now I’m not so sure anymore.

That said - I think it’s still ok to ridicule 3D exploded pie charts.

Robert Kosara summarises his findings here and here. The recent studies were done in collaboration with Drew Skau; an older study in collaboration with Caroline Ziemkiewicz. The Tufte quote is from his book The Visual Display of Quantitative Information. The charts above show workers with permanent jobs and a fixed number of hours per week, as a percentage of all working people in the Netherlands (not just employees), source CBS.

Pretentious institution names

I’m sure I’ve read about it, although I can’t remember where: institutions that adopt a pseudo-historical fantasy name, whether or not after a merger. Often these are names containing ‘ae’, or possibly ‘ck’ or ‘gh’. Incidentally, it doesn’t just happen with institutions (‘Daelzicht’), but also with office buildings, for example (Rivierstaete).

Is it common? Part of the answer can be found on the map.

[Update] - The phenomenon has also been called out here!

Should freedom of information apply to algorithms?

[Update below] - Governments increasingly use data analysis to make decisions that affect citizens. But how transparent are these practices? In a study summarised here, Nicholas Diakopoulos had students file freedom of information requests to obtain, among other things, the algorithms behind government decision-making. Most requests were denied, for a variety of reasons. Some states claimed algorithms aren’t «documents» covered by FOI legislation; others said they were copyrighted.

The article reminded me of the risk profiles Dutch municipal welfare agencies use to decide who to submit to rigorous checks - including very intrusive home searches. As early as 2006, I was involved in a survey by Dutch trade union FNV which found that two in five municipalities used risk profiles for that purpose:

This has the advantage that for a large group of people, unnecessary routine checks can be dispensed with. However, there’s virtually no debate about what criteria can be used without causing unacceptable unequal treatment. Is it ok to select people because they’ve worked in the catering industry, or as a self-employed person? Or because of their nationality?

When the government uses algorithms to exert control over citizens (or when it outsources that task, for that matter), there should be accountability. So would it be possible to obtain such algorithms through an FOI request?

I found one decision that suggests that algorithms aren’t a priori excluded from FOI requests - at least in the eyes of the Utrecht municipality (I used Open State’s FOI search engine to find it). But welfare recipients’ organisation Bijstandsbond informed me that an FOI request has been filed in the past to obtain the risk profiles used by the Amsterdam municipal welfare agency. The request was denied.

[Update 2 July 2016] - Aside from the question of whether you can FOI an algorithm, in Europe it may become possible to ask for «an explanation of the decision reached after [algorithmic] assessment» as a result of the EU’s General Data Protection Regulation, according to this analysis. Not only would this create more transparency; it would also put technical constraints on programmers in that their algorithms have to be interpretable.
