I am now in Montréal, where last Friday I took part in a panel on Open Data at the “Science & You” international conference. It was interesting for me to reflect on how the picture has changed since my previous panel on the same topic, in Kiev in 2012. Back then, we were busy trying to convince public administrations that opening data was good for transparency and could help improve services to communities. Since then, many attempts have been made in numerous countries, with local authorities often pioneering the process and central governments following only later (one example cited in my panel was Québec City). What is made open is typically information from public registers (first names of newborns, records of road accidents) and, increasingly, from technological devices and sensors (bus traffic information).
There are some conditions to be met for a dataset to be called “open”:
- Technically, it needs to be “raw”, detailed, digital and reusable. The French Interior Ministry released results of the first round of the recent presidential elections within a few days, at polling station level. This is sufficiently detailed (with over 69,000 polling stations throughout the country), raw (allowing aggregations, comparisons etc.), and digital/reusable (so much so that the newspaper Le Monde could develop a user-friendly application to let readers easily check results in their neighborhoods). Some would also insist that “open” data should be released in non-proprietary formats (better .csv than .xls, for example).
- Legally, the data must come with a license that allows re-use by third parties (typically within the Creative Commons family). Ideally, no type of reuse should be ruled out (including, somewhat controversially, commercial / for-profit reuse).
- Economically, the data should be available to all for free (or at least with minimal charges if data preparation requires extra work or expenses).
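To make the “raw and reusable” condition concrete, here is a minimal sketch in Python of what anyone can do once polling-station-level results are released as a plain .csv file: aggregate them up to the municipality level and compare shares. The column names (`commune`, `station_id`, `candidate`, `votes`) and the sample rows are my own assumptions, invented purely for illustration; they are not the Interior Ministry's actual file layout or real results.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows, for illustration only: NOT real election results.
# The column names (commune, station_id, candidate, votes) are assumptions.
SAMPLE_CSV = """commune,station_id,candidate,votes
Ville-A,0001,Candidate X,120
Ville-A,0001,Candidate Y,80
Ville-A,0002,Candidate X,60
Ville-A,0002,Candidate Y,140
Ville-B,0001,Candidate X,200
Ville-B,0001,Candidate Y,100
"""


def totals_by_commune(csv_text):
    """Sum station-level votes into per-commune, per-candidate totals."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["commune"]][row["candidate"]] += int(row["votes"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {commune: dict(by_cand) for commune, by_cand in totals.items()}


def shares(candidate_totals):
    """Convert one commune's vote totals into percentage shares."""
    cast = sum(candidate_totals.values())
    return {name: round(100 * v / cast, 1) for name, v in candidate_totals.items()}


if __name__ == "__main__":
    for commune, by_cand in totals_by_commune(SAMPLE_CSV).items():
        print(commune, by_cand, shares(by_cand))
```

The point is less the code than the fact that it needs nothing beyond a standard library: a detailed, non-proprietary file can be regrouped in ways the publisher never anticipated, which is not possible with pre-aggregated totals.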
While a lot of thought has been devoted in the past few years to the “ideal” conditions for data opening and how it would positively affect public service, the data landscape has now changed significantly.
Indeed, open data is no longer purely a government matter: today, new actors are entering the scene. There are civil society associations and individual citizens involved in participatory co-creation initiatives, such as the “OpenStreetMap” project (which crowd-sources the collection of geographical data and makes the results available to all for free). There are also private companies, for which the use and production of data have become lucrative opportunities: even Google Maps, with which OpenStreetMap aims to compete, is partly open to the extent that applications built by others can access and reuse (under some conditions) Google’s geographical data. For a platform like Google, the interest of opening part of its data is clear: it draws in an ever larger ecosystem of firms and applications that depend on its resources to operate.
In parallel, the scope of “open data” is being redefined as it gets closer to “big data”, the massive amounts of information that derive from the digital traces of the activities that individuals and organizations perform through technologies (use of store loyalty cards, logs of mobile phone calls, posts on social media). While not all “open” data is “big” (and vice versa), it is often believed that the most promising economic and social developments of today lie at the intersection of the two: for example, opening up large amounts of local transport data could lead to better traffic forecasts, thereby improving mobility in the city.
These possibilities point to the problem of personal data, traditionally excluded from the scope of data opening but exploited massively by private companies that use “big data”, sometimes without the full awareness (not to mention consent) of the individuals concerned. New social movements are emerging today that aim to restore people’s control over their data, while encouraging them to share these data (selectively) whenever openness can bring communal benefits. For example, the Swiss co-operative midata.coop aims to offer its members a secure online space where they can gather their health data (a particularly sensitive area!) and decide to give access to subsets of these data to other patients, to physicians or to researchers for the development of new treatments. While these efforts are still in their infancy, they can rely on recent technological developments designed to manage access to detailed individual data under high-security conditions.
We need to observe these movements closely, as they may show the way for future developments in these areas. I am sure we will remember them, and hopefully be pleased at their success, in a future panel on open data somewhere in the world, a few years from now.