Converting Election Markup Language (EML) to csv
Note that the map above isn’t really a good illustration here because I used a different data source to create it.
Getting results of Dutch elections at the municipality level can be complicated, but what if you want to dig a little deeper and look at results per polling station? Or even per candidate, per polling station? For elections since 2009, that information is available from the data portal of the Dutch government.
Challenges
The data is in Election Markup Language, an international standard for election data. I didn’t know that format and processing the data posed a bit of a challenge. I couldn’t find a simple explanation of the data structure, and the Electoral Board states that it doesn’t provide support on the format.
For example, how do you connect a candidate ID to their name and other details? I think you need to identify the Kieskring (district) by the contest name of the results file. Then, find the candidate list for the Kieskring and look up the candidate’s details using their candidate ID and affiliation. But with municipal elections, you have to look up candidates in the city’s candidate list (which doesn’t seem to have a contest name).
Practical tips
If you plan to use the data, here are some practical tips:
- Keep in mind that locations and names of polling stations may change between elections.
- If you want to geocode the polling stations, the easiest way is to use the postcode, which is often added to the polling station name (only for recent elections). If the postcode is not available or if you need a more precise location, the lists of polling station names and locations provided by Open State (2017, 2018) may be of use. Use fuzzy matching to match on polling station name, or perhaps you could also match on postcode if available. Of course, such an approach is not entirely error-free.
Further, note that the data for the 2017 Lower House election is only available in EML format for some of the municipalities. I guess this has something to do with the fact that prior to the election, vulnerabilities had been discovered in software to count the votes, so they had to count the votes manually.
Python script
Here’s a Python script that converts EML files to csv. See caveats there.
UPDATE 23 February 2019 - improved version of the script here.