Science, like the rest of human life, is subject to fashions. Data visualisation is the latest trend. Policy-makers and the public are all under its charm, and researchers magically suspend their disbelief: give me a fancy image, and I won’t look too closely at your p-values. So I was intrigued to discover, at a talk a few days ago by Paul Jackson of the Office for National Statistics, that there are precedents, and that they have a long history behind them.
The story is that of John Snow, an epidemiologist who was persuaded, against the received wisdom of the mid-nineteenth century, that cholera does not propagate through air but through contaminated water or food. But how to convince others? When cholera struck London in 1854, Snow began plotting the location of deaths on a map of Soho: he represented each death with a line parallel to the front of the building in which the person died.
Snow soon realised that there was a concentration of “death lines” around Broad Street, and more specifically around a water pump at the corner of Broad Street and Cambridge Street.
He managed to convince the authorities to remove the handle of the pump, so that people could no longer use it: within a few days, the number of deaths in the area plummeted. Snow had proven his point and saved lives, with no medical trials and no sophisticated chemistry, just some basic count statistics and a clever dataviz.
A major health data plan is on the verge of being called off, and may never get a second chance. It is supposed to anonymise all the patient records in the National Health Service (NHS) in the UK, link them together into one single, giant database, and make them available under controlled conditions of use to health researchers and (controversially) to commercial companies too. Public outcry has led to the plan being delayed for six months.
In an article published in The Guardian last week, Ben Goldacre, a medical doctor and high-profile media commentator on science matters, rightly identifies the crux of the matter: in principle, the public accepts the release of data for scientific purposes, but resists commercial exploitation. And with good reason: medical knowledge results from the study of many cases, and the more cases are available, the more accurate the results; in the era of big data, it is also clear that aggregating and sharing a wealth of data such as those held by the NHS is a unique opportunity for medical science to discover ways of saving lives. On the other hand, use of data for any other purpose looks much more opaque, and people understandably feel it might lead to discrimination and harm individuals, for example if disclosure of a person’s health history results in higher insurance premiums or rejected job applications.
Continue reading “Sharing medical data for research: Why we should all care”
Our new book Against the Hypothesis of the End of Privacy is out now! It has been published by Springer and co-authored with Antonio A. Casilli (Telecom ParisTech) and Yasaman Sarabi (University of Greenwich). Please check here for regular updates about the book.
Network data are among those that are changing fastest these days. When I say I study social networks, people almost automatically think of Facebook or Twitter, without necessarily realizing that networks have been around for, well, the whole history of humanity, long before the internet. Networks are just systems of social relationships, and as such, they can exist in any social context: the family, school, workplace, village, church, leisure club, and so forth. Social scientists started mapping and analysing networks as early as the 1930s. But people didn’t think of their social relationships as “networks” and didn’t always see themselves as “networkers”, even if they did invest a lot in their relationships, were aware of them, and cared about them. The term, and the systemic configuration, were just not familiar. There was something inherently informal and implicit about social ties.
What has changed with Facebook and its counterparts is that the network metaphor has become explicit. People are now accustomed to talking about “networks” and to thinking in systemic terms, seeing their own relationships as part of a more global structure. Network ties have become formal: you have to make a clear choice and take an explicit action when you add a “friend” on Facebook or “follow” someone on Twitter; you have a list of your friends, followers or followees (whatever the specific terminology is) and can monitor changes in this list. You know who the friends of your friends are, and can keep track of how many people viewed your profile, included you in their “lists”, or mentioned you in their Tweets. Now, everyone knows what networks are, so if you are a social network researcher and conduct a survey like in the old days, you need not fear that your respondents will misunderstand. In fact, you may not even need to do a survey at all: the formal nature of online ties, digitally recorded and stored, makes it possible to retrieve your network information automatically. You can just mine network tie data from Facebook, Twitter, or whatever service your target populations happen to be using.
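To make the idea of “formal” ties a little more concrete, here is a minimal sketch in Python of how digitally recorded “follow” relationships could be turned into a network and queried. Everything in it is an assumption for illustration: the names and the list of ties are invented, `build_network` and `friends_of_friends` are hypothetical helpers, and no real Facebook or Twitter interface is involved.

```python
from collections import defaultdict

# Invented toy data (not from any real service): each pair is
# (follower, followee), i.e. one explicit, recorded tie.
follows = [("ana", "ben"), ("ben", "cara"), ("cara", "ana"), ("dev", "ana")]


def build_network(edges):
    """Index directed ties in both directions."""
    following = defaultdict(set)  # person -> people they follow
    followers = defaultdict(set)  # person -> people who follow them
    for src, dst in edges:
        following[src].add(dst)
        followers[dst].add(src)
    return following, followers


def friends_of_friends(following, person):
    """People followed by those `person` follows, excluding direct ties."""
    direct = following.get(person, set())
    second = set()
    for friend in direct:
        second |= following.get(friend, set())
    return second - direct - {person}


following, followers = build_network(follows)
print(sorted(friends_of_friends(following, "ana")))  # friends of ana's friends
print(len(followers["ana"]))  # how many people follow ana
```

The point is only that, once ties are explicit records, questions that used to require asking respondents (who are your friends’ friends? how many people name you?) reduce to simple lookups.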
Continue reading “Network data, new and old: from informal ties to formal networks”
The growth of “big data” changes the very essence of modern markets in an important sense. Big data are nothing but the digital traces of a growing number of people’s daily transactions, activities and movements, which are automatically recorded by digital devices and end up in huge amounts in the hands of companies and governments. Payments by debit and credit cards record timing, place, amount, and identity of payer and payee; supermarket loyalty cards report purchases by type, quantity, price, date; frequent traveler programs and public transport cards log users’ locations and movements; and CCTV cameras in retail centers, buses and urban streets capture details from clothing and gestures to facial expressions.
This means that all our market transactions, purchases and sales alike, are identifiable, and our card providers know a great deal about our economic actions. Our consumption habits (and income and tastes) may seem more opaque to scrutiny but can, at least to some extent, be inferred from our locations, movements, and the details of our spending. If I buy some beer, maybe my supermarket cannot tell much about my drinking; but if I never buy any alcohol, it will have strong reasons to conclude that I am unlikely to get drunk. As data crunching techniques progress (admittedly, they are still in their infancy now), my supermarket will get better and better at gauging my habits, practices and preferences.
Continue reading “Big Data redefine what ‘markets’ are”