Counting online workers

I have just discovered this very interesting new paper by Otto Kässi, Vili Lehdonvirta and Fabian Stephany. Their data-driven count of online workers is reminiscent of the research I published last year with Clément Le Ludec and Antonio A. Casilli.

There are differences of course: theirs is a large multi-country study while we focused on one national setting (France). Also: Kässi et al. consider online labour in general, while we looked specifically at micro-work.

Nevertheless, there are striking similarities. Both studies included larger as well as smaller and more peripheral platforms, often left aside in previous research. Both started from the numbers of registered users declared by the platforms in scope, although this is likely an upper bound. Indeed, registering does not necessarily mean using: researchers (like ourselves) and journalists, for example, may register only to observe, especially when registration is open and easy.

Also, both studies used web traffic analysis data but for different purposes. We used them as an estimate of minimally active users – those who connect at least monthly, as per the definition given by the providers of these data. For the platforms we observed, these numbers tend to be lower than registrations.

Instead, Kässi et al. have used these data to assess registration numbers for the platforms that do not report them. My first reaction would be to think their estimates are likely a lower bound. But presumably their use of a mix of sources, and the seriousness and caution with which they have conducted their estimate, provide enough correction.

Finally, both studies attempted to correct estimates downward by taking into account multi-homing – the tendency of users to rely on multiple platforms. The coefficient of Kässi et al. is 1.83, ours was 1.27. The gap is due to the fact that we focused only on micro-work: if we had counted participation across all types of online labour platforms, our coefficient would be just below 2 – not far from theirs! Kässi et al. also correct for the possibility of multiple workers using a single account, which we did not observe in our French sample. One might imagine other corrections depending on observed usages. For example my ongoing Latin American study of micro-workers suggests that there are unofficial sales and purchases of highly rated platform accounts, more likely to access better-paying tasks – again, something we did not observe in France. Kässi et al. rightly note that all these corrections come from ad hoc surveys and should be interpreted with caution.
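To make the multi-homing correction concrete, here is a minimal sketch. It assumes the coefficient is read as the average number of platforms on which a worker holds an account, so that summed registrations are divided by it. The function name, the optional `workers_per_account` factor and the figure of 1,000 registrations are hypothetical illustrations, not values or procedures taken from either study.

```python
def unique_workers(registrations: float,
                   multi_homing: float,
                   workers_per_account: float = 1.0) -> float:
    """Rough count of distinct workers behind a total of registrations.

    registrations       -- accounts summed across all platforms in scope
    multi_homing        -- assumed average number of platforms per worker (>= 1)
    workers_per_account -- hypothetical factor for shared accounts (>= 1)
    """
    if multi_homing < 1 or workers_per_account < 1:
        raise ValueError("both correction factors must be at least 1")
    # Multi-homing pushes the count down, account sharing pushes it up.
    return registrations / multi_homing * workers_per_account


# Hypothetical total of 1,000 registrations, corrected with each coefficient.
for label, coefficient in [("micro-work only", 1.27), ("all online labour", 1.83)]:
    print(label, round(unique_workers(1000, coefficient)))
```

The same registration total yields a noticeably smaller headcount under the higher coefficient, which is why the scope of platforms considered matters when comparing the two figures.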

Overall, I would say that both studies point to the need to put in place new and creative methods to account for these new forms of labour that traditional statistical studies fail to capture well. The price to pay, as both studies stress, is a high degree of uncertainty. I also dare suggest that both are mixed-method studies: while the design is essentially quantitative, input from smaller and even qualitative research is crucial – for example to get insight into multi-homing and multi-working.

Before concluding, let us recall the key results. Kässi et al. reckon that there are 163 million freelancer profiles registered on online labour platforms globally, of whom approximately 19 million have worked at least once, and 5 million work more intensely. We estimated that approximately 260,000 French residents are registered with micro-work platforms, of whom some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and a more restrictive measure of ‘very active’ workers would decrease this figure to 15,000.

Are these numbers large or small? Curiously, our French study attracted both criticisms: some worried that we might be overstating the importance of micro-work, others wondered why we bothered with such a tiny part of national GDP. The question is not easy to answer, as it depends on the perspective taken and the goals pursued: the same numbers would mean different things to policymakers and to researchers, for example. Nevertheless, I think the point that matters to everyone is that this population exists and needs attention, despite its limited visibility and the fuzzy boundaries that make its size so difficult to assess.

Big data and the hypothesis of the end of privacy

In the late 2000s, voices suggesting that our societies might be nearing the ‘end of privacy’ became increasingly deafening. Our cultural, political and regulatory environment was on the verge of major transformation – so went the narrative. Businesses rejoiced, as it is commonly held that less privacy and more information oil the economy.

In a video interview with Italian media Idee Sottosopra, I review the courses of action taken by various stakeholders, in particular Internet companies, and examine their conflicts and controversies. I show how the very concept of privacy, inherited from a long legal and judicial tradition, should be revised and redefined to appropriately describe today’s online interactions.

Overall, there is no deterministic and inevitable tendency to exclude privacy from our societies, but rather a tension between social forces for and against privacy, which has accompanied the advent of the digital economy and especially social media. The positions of stakeholders, especially users, are often ambiguous, and social media companies attempted to leverage this ambiguity to their own advantage.

Yet civil society reactions have grown stronger, and after initial David-vs-Goliath attempts by individuals and small associations, more and more authoritative institutions have taken the defence of privacy seriously. We are no longer left to costly and barely visible individual choices, and especially since the entry into force of the GDPR in Europe, we now have an unprecedented opportunity to act at a more systemic level.

Big Data. L’ipotesi della fine della Privacy | Società Digitale | Idee Sottosopra

How many ‘micro-workers’?

Finally published! Counting ‘micro-workers’: Societal and methodological challenges around new forms of labour is a paper that I co-authored with Clément Le Ludec and Antonio A. Casilli, and that has just been published in a special issue of the journal Work Organisation, Labour & Globalisation.

What is it about? ‘Micro-work’ consists of fragmented data tasks that myriad providers execute on online platforms. While crucial to the development of data-based technologies, this barely visible and geographically dispersed activity is particularly difficult to measure. To fill this gap, we combined qualitative and quantitative methods (online surveys, in-depth interviews, capture-recapture techniques, and web traffic analytics) to count micro-workers in a single country, France. On the basis of this analysis, we estimate that approximately 260,000 people are registered with micro-work platforms. Of these, some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and we speculate that using a more restrictive measure of ‘very active’ workers decreases this figure to 15,000. This analysis is important to better understand platform labour and the labour in the digital economy that lies behind artificial intelligence.
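For readers unfamiliar with capture-recapture, the sketch below shows the textbook two-sample estimators (Lincoln-Petersen and Chapman's small-sample variant). It is not necessarily the variant used in the paper. The idea is that if two independent samples are drawn from the same population, the share of the second sample already seen in the first indicates how much of the population the first sample covered. The sample sizes used here are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Two-sample population estimate N = n1 * n2 / overlap.

    n1, n2  -- sizes of the first and second sample
    overlap -- individuals found in both samples (must be > 0)
    """
    if overlap <= 0:
        raise ValueError("the estimator is undefined without any overlap")
    return n1 * n2 / overlap


def chapman(n1: int, n2: int, overlap: int) -> float:
    """Chapman's variant, less biased for small samples and defined at overlap = 0."""
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1


# Hypothetical samples: 200 and 150 respondents, 30 of whom appear in both.
print(lincoln_petersen(200, 150, 30))
print(round(chapman(200, 150, 30), 1))
```

Both estimators rest on strong assumptions (a closed population, independent samples, equal chances of being sampled), which is one reason such counts carry a high degree of uncertainty.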

The big data moment of the social sciences: what access to web and social media data?

Round table, Sciences Po Paris, 6 December 2018, 6:00 pm


For social science research to take full advantage of large digital databases, one obstacle remains to be removed: access to these data is limited, unequally distributed, and surrounded by legal and ethical uncertainty. We propose to discuss this on the occasion of the publication of the special issue of the Revue Française de Sociologie on “Big data, sociétés et sciences sociales” (n. 59/3). This round table brings together researchers and other stakeholders, public and

With:

  • Garance Lefèvre, Policy senior associate, Uber
  • Roxane Silberman, Scientific advisor, Centre d’Accès Sécurisé aux Données (CASD)
  • Sophie Vulliet-Tavernier, Director of relations with the public and research, Commission Nationale de l’Informatique et des Libertés (CNIL)
  • The authors of the special issue.

Moderators: Gilles Bastin (Univ. Grenoble Alpes) and Paola Tubaro (CNRS), editors of the special issue.

Admission is free, subject to available seating: to register, click here.

Access: Sciences Po, Goguel room. Entrance at 27 rue Saint-Guillaume, 75007 Paris (cross the garden and take the lift to the top floor). The round table is organised by the Revue Française de Sociologie in collaboration with the Presses de Sciences Po. It will be followed by a reception.


Call for papers: “Recent Ethical Challenges in Social Network Analysis”

Submissions are now invited for a special section of the journal Social Networks on “Recent Ethical Challenges in Social Network Analysis” (guest-edited by myself with Antonio A. Casilli, Alessio D’Angelo, and Louise Ryan).

Research on social networks raises formidable ethical issues that often fall outside existing regulations and guidelines. State-of-the-art tools to collect, handle, and store personal data expose both researchers and participants to new risks. Political, military and corporate interests interfere with scientific priorities and practices, while legal and social ramifications of studies of personal ties and human networks come to the surface.

The proposed special section aims to critically engage with ethics in research related to social networks, specifically addressing the challenges that recent technological, scientific, legal and political transformations trigger.

Following a successful workshop on this topic that was held in December 2017 in Paris, we welcome submissions that critically engage with ethics in research related to social networks, possibly based on reflective accounts of first-hand experiences or case studies, taken as concrete illustrations of the general principles at stake, the attitudes and behaviors of stakeholders, or the legal and institutional constraints. We are particularly interested in novel, original answers to some unprecedented ethical challenges, or the need to reinterpret norms in ambiguous situations.

The full Call for Papers is available here.

More than complex: large and rich network structures

I co-organize this Satellite to the NETSCI2018 Conference in Paris, 12 June 2018. We are now accepting submissions of proposals for presentations.

Information on the Satellite

In traditional research paradigms, sociology handles small but rich networks where the richness of network attributes is derived from the specific buildup of the data collection process. In the sociological approach, differences among nodes and edges are key to describe network properties and the ensuing dynamical social processes. Instead, the complex systems tradition deals with large but poor networks. Assuming statistical equivalence of graph entities, a mean field treatment serves to describe the aggregate properties of the network. Today’s network datasets contain an unprecedented quantity of relational information at, and between, all possible levels: individuals, social groups, political structures, economic actors, etc. We finally deal with large and rich network structures that expose the implicit limitations of the two above-mentioned approaches: the traditional methods from social science cannot be upscaled because of their algorithmic complexity, and those from complex systems lose track of the complex nature of the actors, their relationships and their processes. This workshop aims to develop an interdisciplinary reflection on how methods from social science could be upscaled to large network structures and on how methods from complex systems could be downscaled to deal with small heterogeneous structures.

We are proud that five prominent international scholars are our invited speakers: Camille Roth, SciencesPo Paris; Matthieu Latapy, LIP6UPMC Paris; Alessandro Lomi, ETH Zurich; Fariba Karimi, GESIS Cologne; Noshir Contractor, Northwestern University.


We invite abstracts of published or unpublished work for contributed talks to take place at the satellite symposium. We expect a broad range of topics to be covered, across theory, methodology, and application to empirical data, relating to an interdisciplinary reflection on how methods from social science could be upscaled to large network structures and on how methods from complex systems could be downscaled to deal with small heterogeneous structures.

Submissions can be made through our website.

Submissions are required to be at most 650 words long and should include the following information: title of the talk, author(s), affiliation(s), email address(es), name of the presenter, abstract. Papers or submissions longer than 1 page will not be accepted.

Important dates

Abstract submission deadline is March 25, 2018. Notification of acceptance will be no later than April 23, 2018.

All participants and accepted speakers will have to register through the NETSCI2018 website.

Rethinking ethics in social-network research

Social links.

Antonio A. Casilli, Télécom ParisTech – Institut Mines-Télécom, Université Paris-Saclay and Paola Tubaro, Centre national de la recherche scientifique (CNRS)

Fueled by increasingly powerful computing and visualization tools, research on social networks is flourishing. However, it raises ethical issues that largely escape existing codes of conduct and regulatory frameworks. The economic power of large data platforms, the active participation of network members, the spectre of mass surveillance, the effects of networking on health, the place of artificial intelligence: so many questions in search of solutions.

Social networks, what are we talking about?

The expression “social network” has become common, but those who use it to refer to social media such as Facebook or Instagram often ignore its origin and its true meaning. The study of social networks precedes the advent of digital technologies. Since the 1930s, sociologists have been conducting surveys to describe the structures of relationships that unite individuals and groups: their “networks”. These include, for example, advice relationships between employees of a company, or friendship ties between students in a school. These networks can be represented as points (students) united by lines (links).

Figure 1: a social network of friendship ties between students in a school. Circles = girls, triangles = boys, arrows = ties.
J.L. Moreno, Who shall survive? 1934.

Long before anyone questioned the social aspects of Facebook and Twitter, this research shed light on, for example, marital role segregation, the importance of “weak ties” in job search, the informal organization of firms, the diffusion of innovations, the formation of business elites, and social support for the sick or elderly. Designers of digital platforms such as Facebook have picked up some of the analytical principles on which these works were based, developing them with the mathematical theory of graphs (though often with less attention to the social issues involved).

Early on, researchers in this field realized that the traditional principles of research ethics (focusing on informed consent of study participants and anonymization of data) were difficult to ensure. By definition, social networks research is never about a single individual, but about relationships between this individual and others – their friends, relatives, collaborators or professional advisors. If the latter are reported by the respondent but are not themselves included in the study, it is difficult to see how their consent could be obtained. What’s more, results can be difficult to anonymize, in that visuals are sometimes disclosive even in the absence of personal identifiers.

Ethics in the digital society: a minefield

Academics have long been thinking about these ethical difficulties, to which a special issue of the prestigious Social Networks journal was dedicated as far back as 2005. Today, researchers’ dilemmas are exacerbated by the increased availability of relational data collected and exploited by digital giants like Facebook or Google. New problems arise as the boundaries between “public” and “private” spheres become confused. To what extent do we need consent to access messages that digital service users send to their contacts, their “retweets”, or their “likes” on their friends’ walls?

These sources of information are often the property of commercial enterprises, and the algorithms they use likely bias observations. For example, can we interpret in the same way a contact created spontaneously by a user, and a contact created as a result of an automated recommendation system? In short, the data do not speak for themselves, and before thinking about their analysis, we must question the conditions of their use and the methods of their production. They largely depend on the software architectures imposed by platforms as well as their economic and technical choices. There is a real power asymmetry between platforms – often the property of large multinational companies – and researchers – especially those working in the public sector, and whose objectives are misaligned with investors’ priorities. Negotiations (if possible at all) are often difficult, resulting in restrictions to proprietary data access – particularly penalizing for public research.

Other problems arise as a researcher may even use paid crowdsourcing to produce data, using platforms like Amazon Mechanical Turk to ask large numbers of users to complete a questionnaire, or even to download their online contact lists. But these services raise numerous questions in terms of workers’ rights, working conditions and appropriation of the product of work. The resulting uncertainty hinders research that could otherwise have a positive impact on knowledge and on society at large.

The availability of online communication and publication tools, which many researchers are now seizing, increases the likelihood that research results may be diverted for political or business purposes. While the interest of military and police circles in the analysis of social networks is well known (Osama Bin Laden was allegedly located and neutralised following the application of social network analysis principles), such appropriations are more frequent today, and less easily controlled by researchers. A significant risk is the use of these principles to suppress civic and democratic movements.

Figure 2: Simulation of the structure of an Al-Qaeda network. Courtesy of the authors.
Kouznetsov A., Tsvetovat M., Social Network Analysis for Startups, 2011

The role of the researcher

Restrictions and prohibitions would likely aggravate the constraints that already weigh on researchers, without helping them overcome these obstacles. Rather, it is important to create conditions for trust and enable researchers to explore the full extent and importance of online and offline social networks – allowing them to capture salient economic and social phenomena while remaining respectful of people’s rights. Researchers should take an active role, participating in the co-construction of an adequate ethical framework, grounded in their experience and self-reflective attitude. A bottom-up process involving academics as well as citizens, civil society associations, and representatives of public and private research organizations could then feed these ideas and thoughts back to regulators (such as ethics committees).

Antonio A. Casilli, Associate professor at Télécom ParisTech and research fellow at Centre Edgar Morin (EHESS), Télécom ParisTech – Institut Mines-Télécom, Université Paris-Saclay, and Paola Tubaro, Researcher at LRI, Laboratoire de Recherche Informatique du CNRS, and lecturer at ENS, Centre national de la recherche scientifique (CNRS)

The original version of this article was published on The Conversation.