Big data, big money: how companies thrive on informational resources

Information oils the economy – as we have known since the path-breaking research of George Akerlof, Michael Spence and Joseph Stiglitz in the 1970s – and information can be extracted from data. Today, the increased availability of “big” data creates the opportunity to access ever more information – for the good of the economy, then.

But in practice, how do companies extract value from this increasingly available information? In a nutshell, there are three ways in which they can do so: matching, targeted advertising, and market segmentation.

Matching is the key business idea of many recently created companies and start-ups, and consists in helping the potential parties to a transaction find each other: driver and passenger (Uber), host and guest (Airbnb), buyer and seller (eBay), and so on. Matching is done by processing users’ data with suitable algorithms, and the more detailed the data, the more satisfactory the match. The firms’ business model is usually based on taking a fee for each successful transaction (each realized match).

Targeted advertising is the practice of selecting, for each user, only the ads that best correspond to their tastes or practices. Advertising diapers to the general population will be largely ineffective as many people do not have young children; but targeting only those with young children is likely to produce better results. Here, the function of data is to help decide what to advertise to whom; useful data are people’s socio-demographic situation (age, marriage, children…), their current or past practices (if you bought diapers last week, you might do so again next week), and any declared tastes (for example in a post on Facebook or Twitter). How this produces a gain is obvious: if targeted adverts are more effective, sales will go up.
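To make the diaper example concrete, here is a minimal sketch of the selection step. The user records and field names are invented for illustration; real systems rely on far richer data and statistical models rather than a simple rule.

```python
# Hypothetical user records; the field names are illustrative assumptions.
users = [
    {"id": 1, "has_young_children": True,  "bought_diapers_last_week": True},
    {"id": 2, "has_young_children": False, "bought_diapers_last_week": False},
    {"id": 3, "has_young_children": True,  "bought_diapers_last_week": False},
    {"id": 4, "has_young_children": False, "bought_diapers_last_week": True},
]


def diaper_ad_audience(records):
    """Keep only users whose situation or past purchases suggest interest."""
    return [
        u["id"]
        for u in records
        if u["has_young_children"] or u["bought_diapers_last_week"]
    ]


# The ad is shown to this subset instead of the whole population.
audience = diaper_ad_audience(users)
```

The rule combines a socio-demographic marker with a past practice, the first two kinds of data mentioned above.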


Discussing platform cooperativism

On Monday, 7 December 2015 at Telecom ParisTech, I was the discussant at a seminar by New School scholar Trebor Scholz on “Unpacking Platform Cooperativism”.


Internet platforms carry an unprecedented potential for value creation, exploiting the extraordinary power of data and algorithms to extract and distribute information to an extent never seen before. Information, we have known since Hayek’s time, is the fuel that keeps markets going, that eliminates “lemons” and ensures ever-better coordination between buyers and sellers, borrowers and lenders, or landlords and tenants. At the same time, the internet has channeled the dream of a viable non-market society, since Rheingold’s 1993 revival of the “community” and Barbrook’s 1998 “hi-tech gift economy”. So, can we put this informational efficiency to the service of a more humane economy, based on relationships, solidarity and reciprocation, rather than on the sheer market system?

The so-called “sharing economy” suggests answers, but also displays a tension: the efforts of myriad grassroots associations to develop collaboration as a value and a practice contrast sharply with the spectacular growth of firms like Airbnb and Uber, now large multinationals, and their alleged cavalier attitude to anti-trust regulations and workers’ rights. Even if some say Uber is not really about sharing and collaboration, it is difficult to draw the line.

This ambiguity is fostered by a public discourse that focuses on the sharing of assets – the spare room in your home, or a seat in your car – that digital platforms enable. Asset-sharing has economic and social appeal: it increases efficiency by preventing assets from lying idle, while reducing waste, shifting emphasis away from consumerist values (“access is better than ownership”), and facilitating sociality beyond mere consumption.

But it is often forgotten that asset-sharing does not produce value by itself: it involves extra labour. In economic jargon, capital and labour are complementary production factors. In practice, if you want to put your spare room on Airbnb, you must produce an ad, monitor your message inbox and reply swiftly. You must clean the room and do the laundry before and after a guest’s visit. You must show your guests around when they arrive.

More importantly, the very opportunity of asset-sharing changes the incentives that shape labour supply – people’s willingness to sell their time and effort in exchange for payment. Because of the expected compensation, some people will give up the use of a (non-spare) room to accommodate visitors instead, and others will make more journeys to drive passengers around – so it’s not really about sharing unused assets, it is about self-employment and starting a micro-business. A work opportunity as a complement to (and sometimes a substitute for) a main job.

This is where debates on internet platforms and the sharing economy rejoin the growing literature on digital labour – and where the contribution of Trebor Scholz is illuminating. Where others see assets (i.e., capital), he sees labour. He shows us that the bottlenecks here are about labour, not capital, and that the success – be it economic or social – of the sharing economy is closely tied to the destiny of labour. Whether it appears on the surface as self-employment, micro-entrepreneurship or salaried work doesn’t really matter. Trebor reminds us of Marx’s fundamental principle that production relations are central to our (capitalist) society, and that value generation rests ultimately on labour. If this very crucial part of the human experience goes wrong, even the best side of the sharing economy – the one that endorses trust, reciprocity, and zero-waste – may fail to have any transformative effect on society.



World Statistics Day 2015

This week was World Statistics Day, celebrated at the UN and in individual countries around the world. While celebrating the successes of official statistics throughout its history of producing vital information for governments and citizens, this time much of the debate focused on its – more uncertain – future. The landscape is rapidly changing, swiftly shifting from a data-scarce to a data-rich world, from structured to unstructured data, from the quasi-monopoly of official statisticians on the production of information to fierce competition, from pure statistics to multi-disciplinarity and the rise of so-called “data science”. There are obvious opportunities, but also formidable challenges, and it is always difficult for large organisations (such as statistical institutes) to adapt.

The President of the IAOS urged official statisticians to stick to the UN-backed Fundamental Principles of Official Statistics as a guide. She focused on the efficiency and ethics of engaging with users and the private sector, combined with the rigour of methods, to deliver “better data for better lives” (the slogan of the day).


#bigdataBL

On Friday last week, the British Sociological Association (BSA) held an event on “The Challenge of Big Data” at the British Library. It was interesting, stimulating and relevant – I was particularly impressed by the involvement of participants and the very intense live-tweeting, never so lively at a BSA event! And people were particularly friendly and talkative both on their keyboards and at the coffee tables… so in honour of all this, I am choosing the hashtag of the day, #bigdataBL, as the title here.

(Visualisation: http://www.digitalcoeliac.com/)

Some highlights:

  • The designation of “big data” is from industry, not (social) science, said a speaker at the very beginning. And it is known to be fuzzy. Yet it becomes a relevant object of scientific inquiry in that it is bound to affect society, democracy, the economy and, well, social science.
  • Big-data practices change people’s perception of data production and use. Ordinary people are now increasingly aware that a growing range of their actions and activities are being digitally recorded and stored. Data are now a recognized social object.
  • Big data needs to be understood in the context of new forms of value production.
  • So, social scientists need to take note (and this was the intended motivation of the whole event). The complication is that Big Data matter for social science in two different ways. First, they are an object of study in themselves – what are their implications for, say, inequalities, democratic participation, the distribution of wealth? Second, they offer new methods to be exploited to gain insight into a wide range of (traditional and new) social phenomena, such as consumer behaviours (think of Tesco supermarket sales data).
  • Put differently, if you want to understand the world as it is now, you need to understand how information is created, used and stored – that’s what the Big Data business is all about, both for social scientists and for industry actors.


What is data?

All the hype today is about Data and Big Data, but this notion may seem a bit elusive. My students sometimes struggle to understand the difference between “data” and “literature”, perhaps because of the unfortunate habit of calling library portals “databases”. Even colleagues are sometimes uncomfortable with the notion of data (whether “big” or “small”) and the breadth it is now taking. So, a definition can be helpful.

Data are pieces of unprocessed information – more precisely raw indicators, or basic markers, from which information is to be extracted. Untreated, they hardly reveal anything; subject to proper analysis, they can disclose the inner workings of some relevant aspects of reality.

The “typical” example of socioeconomic data is the observations/variables matrix, where each row represents an observation – an individual in a population – and each column represents a variable – a particular indicator about that individual, for example age, gender, or geographical location. (In truth, data types are more varied and may also include unstructured text, images, audio and video; but for the sake of simplicity, let’s stick to the Matrix here.)
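For readers who like to see things in code, here is a minimal sketch of such a matrix in plain Python. The three individuals and their values are made up purely for illustration.

```python
# Hypothetical toy dataset: each row is one observation (an individual),
# each column is one variable recorded about that individual.
variables = ["age", "gender", "location"]
observations = [
    [34, "F", "Paris"],
    [51, "M", "London"],
    [27, "F", "Paris"],
]


def column(name):
    """Return all values of one variable, i.e. one column of the matrix."""
    j = variables.index(name)
    return [row[j] for row in observations]


# Untreated, the rows reveal little; even a simple analysis extracts information.
mean_age = sum(column("age")) / len(observations)
paris_share = column("location").count("Paris") / len(observations)
```

Reading along a row describes one person; reading down a column describes how one indicator varies across the population, which is where analysis usually starts.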



Data in the public sector: Open data and research data

The “open data” movement is radically transforming policy-making. In the name of transparency and openness, the UK, US and other governments are releasing large amounts of records. It is a way to hold the government to account: in the UK for example, all lobbying efforts in the form of meetings with senior officers are now publicly released. Data also enable the public to make more informed decisions: for example, using apps from public transport services to plan their journeys, or tracking indicators of, say, crime or air pollution levels in their area to decide where to buy property. Data are provided as a free resource for all, and businesses may use them for profit.

The open data movement is not limited to the censuses and surveys produced by National Statistical Institutes (NSIs), the public-sector bodies traditionally in charge of collecting, storing and analyzing data for policy purposes. It extends to other administrations such as the Department for Work and Pensions or the Department for Education in the UK, which also gather and process data, though usually through a different process, not using questionnaires but rather registers.


Big data: Quantity or quality?

The very designation of “Big” Data suggests that the size of datasets is the dividing line, distinguishing them from “Small” Data (the surveys and questionnaires traditionally used in social science and statistics). But is that all – or are there other, and perhaps more profound, differences?

Let’s start from a well-accepted, size-based definition. In its influential 2011 report, McKinsey Global Institute depicts Big Data as:

“datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze”.

Similarly, O’Reilly Media (2012) defines it as:

“data that exceeds the processing capacity of conventional database systems”.

The literature goes on to discuss how to quantify this size, typically measured in terms of bytes. McKinsey estimates that:

“big data in many sectors today will range from a few dozen terabytes to multiple petabytes (thousands of terabytes)”

This is not set in stone, though, depending on both technological advances over time and specific industry characteristics.


Hallo world – a new blog is now live!

Hallo Data-analyst, Data-user, Data-producer or Data-curious — whatever your role, if you have the slightest interest in data, you’re welcome to this blog!

This is the first post and as is customary, it needs to tell what the whole blog is about. Well, data. Of course! But it aims to do so in an innovative, and hopefully useful, way.

