Research ethics in the age of digital platforms

I am thrilled to announce the (open access) publication of ‘Research ethics in the age of digital platforms’ in Science and Engineering Ethics, co-authored with José Luis Molina, Antonio A. Casilli & Antonio Santos Ortega.


We examine the implications of the use of digital micro-working platforms for scientific research. Although these platforms offer ways to make a living or to earn extra income, micro-workers lack fundamental labour rights and ‘decent’ working conditions, especially in the Global South. We argue that scientific research currently fails to treat micro-workers in the same way as in-person human participants, producing de facto a double morality: one applied to people with rights acknowledged by states and international bodies (e.g. Helsinki Declaration), the other to ‘guest workers of digital autocracies’ who have almost no rights at all.

Learners in the loop: hidden human skills in machine intelligence

I am glad to announce the publication of a new article in a special issue of the journal Sociologia del lavoro, dedicated to digital labour.

Today’s artificial intelligence, largely based on data-intensive machine learning algorithms, relies heavily on the digital labour of invisibilized and precarized humans-in-the-loop who perform multiple functions of data preparation, verification of results, and even impersonation when algorithms fail. This form of work contributes to the erosion of the salary institution in multiple ways. One is commodification of labour, with very little shielding from market fluctuations via regulative institutions, exclusion from organizational resources through outsourcing, and transfer of social reproduction costs to local communities to reduce work-related risks. Another is heteromation, the extraction of economic value from low-cost labour in computer-mediated networks, as a new logic of capital accumulation. Heteromation occurs as platforms’ technical infrastructures handle worker management problems as if they were computational problems, thereby concealing the employment nature of the relationship, and ultimately disguising human presence.

My just-published paper highlights a third channel through which the salary institution is threatened, namely misrecognition of micro-workers’ skills, competencies and learning. Broadly speaking, salary can be seen as the framework within which the employment relationship is negotiated and resources are allocated, balancing the claims of workers and employers. In general, the most basic claims revolve around skill, and in today’s ‘society of performance’ where value is increasingly extracted from intangible resources and competencies, unskilled workers are substitutable and therefore highly vulnerable. In human-in-the-loop data annotation, tight breakdown of tasks, algorithmic control, and arm’s-length transactions obfuscate the competence of workers and discursively undermine their deservingness, shifting power away from them and voiding the equilibrating role of the salary institution.

Following Honneth, I define misrecognition as the attitudes and practices that result in people not receiving due acknowledgement for their value and contribution to society, in this case in terms of their education, skills, and skill development. Platform organization construes work as having little value, and creates disincentives for micro-workers to engage in more complex tasks, weakening their status and their capacity to be perceived as competent. Misrecognition is endemic in these settings and undermines workers’ potential for self-realization, negotiation and professional development.

My argument is based on original empirical data from a mixed-method survey of human-in-the-loop workers in two previously under-researched settings, namely Spain and Spanish-speaking Latin America.

An openly accessible version of the paper is available from the HAL repository.

Human listeners and virtual assistants: privacy and labor arbitrage in the production of smart technologies

I’m glad to announce the publication of new research, as a chapter in the fabulous Digital Work in the Planetary Market, a volume edited by Mark Graham and Fabian Ferrari and published in open access by MIT Press.

The chapter, co-authored with Antonio A. Casilli, starts by recalling how, in spring 2019, public outcry followed media revelations that major producers of voice assistants recruit human operators to transcribe and label users’ conversations. These high-profile cases uncovered the paradoxically labor-intensive nature of automation, the ultimate cause of the highly criticized privacy violations.

The development of smart solutions requires large amounts of human work. Sub-contracted on demand through digital platforms and usually paid by piecework, myriad online “micro-workers” annotate, tag, and sort the data used to prepare and calibrate algorithms. These humans are also needed to check outputs – such as automated transcriptions of users’ conversations with their virtual assistant – and to make corrections if needed, sometimes in real time. The data that they process include personal information, of which voice is an example.

We show that the platform system exposes both consumers and micro-workers to high risks. Because producers of smart devices conceal the role of humans behind automation, users underestimate the degree to which their privacy is challenged. As a result, they might unwittingly let their virtual assistant capture children’s voices, friends’ names and addresses, or details of their intimate life. Conversely, the micro-workers who hear or transcribe this information face the moral challenge of taking the role of intruders, and bear the burden of maintaining confidentiality. Through outsourcing, platforms often leave them without sufficient safeguards and guidelines, and may even shift onto them the responsibility to protect the personal data they happen to handle.

Besides, micro-workers themselves release their personal data to platforms. The tasks they do include, for example, recording utterances for the needs of virtual assistants that need large sets of, say, ways to ask about the weather to “learn” to recognize such requests. Workers’ voices, identities and profiles are personal data that clients and platforms collect, store and re-use. With many actors in the loop, privacy safeguards are looser and transparency is harder to ensure. Lack of visibility, not to mention of collective organization, prevents workers from taking action.

Note: Description of one labor-intensive data supply chain. A producer of smart speakers located in the US outsources AI verification to a Chinese platform (1) that relies on a Japanese online service (2) and a Spanish sub-contractor (3) to recruit workers in France (4). Workers are supervised by an Italian company (5), and sign up to a microtask platform managed by the lead firm in the US (6). Source: Authors’ elaboration.

These issues become more severe when micro-tasks are subcontracted to countries where labor costs are low. Globalization enables international platforms to allocate tasks for European and North American clients to workers in Southeast Asia, Africa, and Latin America. This global labor arbitrage goes hand in hand with a global privacy one, as data are channeled to countries where privacy and data protection laws provide uneven levels of protection. Thus, we conclude that any solution must be dual – protecting workers to protect users.

The chapter is available in open access here.

Counting online workers

I have just discovered this very interesting new paper by Otto Kässi, Vili Lehdonvirta and Fabian Stephany. Their data-driven count of online workers reminds me of the research I published last year with Clément Le Ludec and Antonio A. Casilli.

There are differences of course: theirs is a large multi-country study while we focused on one national setting (France). Also: Kässi et al. consider online labour in general, while we looked specifically at micro-work.

Nevertheless, there are striking similarities. Both studies included larger as well as smaller and more peripheral platforms, often left aside in previous research. Both started from the numbers of registered users declared by the platforms in scope, although this is likely an upper bound. Indeed, registering may not mean using: for example, researchers (like ourselves) and journalists may register only to observe, especially when registration is open and easy.

Also, both studies used web traffic analysis data but for different purposes. We used them as an estimate of minimally active users – those who connect at least monthly, as per the definition given by the providers of these data. For the platforms we observed, these numbers tend to be lower than registrations.

Instead, Kässi et al. have used these data to assess registration numbers for the platforms that do not report them. My first reaction would be to think their estimates are likely a lower bound. But presumably their use of a mix of sources, and the seriousness and caution with which they have conducted their estimate, provide enough correction.

Finally, both studies attempted to correct estimates downward by taking into account multi-homing – the tendency of users to rely on multiple platforms. The coefficient of Kässi et al. is 1.83, ours was 1.27. The gap is due to the fact that we focused only on micro-work: if we had counted participation across all types of online labour platforms, our coefficient would be just below 2 – not far from theirs! Kässi et al. also correct for the possibility of multiple workers using a single account, which we did not observe in our French sample. One might imagine other corrections depending on observed usages. For example my ongoing Latin American study of micro-workers suggests that there are unofficial sales and purchases of highly rated platform accounts, more likely to access better-paying tasks – again, something we did not observe in France. Kässi et al. rightly note that all these corrections come from ad hoc surveys and should be interpreted with caution.
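To make the downward correction concrete, here is a minimal sketch in Python. It rests on an assumption of mine about how to read the coefficient: that it is the average number of platforms on which a worker holds an account, so that dividing the summed registrations by it gives a rough count of distinct people. The function name and the raw total of 100,000 registrations are purely hypothetical, chosen for illustration; neither study necessarily computes its correction in exactly this way.

```python
def deduplicate_registrations(total_registrations: float,
                              multihoming_coefficient: float) -> float:
    """Rough count of distinct workers behind a sum of platform registrations.

    Assumption (not taken from either paper): the coefficient is the average
    number of platforms per worker, so it must be at least 1.
    """
    if multihoming_coefficient < 1:
        raise ValueError("a multi-homing coefficient below 1 is not meaningful")
    return total_registrations / multihoming_coefficient


# Hypothetical raw total, summed over all platforms in scope.
raw_total = 100_000

# The two coefficients quoted above.
coefficients = {
    "micro-work only (our study)": 1.27,
    "all online labour (Kässi et al.)": 1.83,
}

for label, coefficient in coefficients.items():
    workers = deduplicate_registrations(raw_total, coefficient)
    print(f"{label}: about {workers:,.0f} distinct workers")
```

The higher the coefficient, the stronger the downward correction, which is why the choice between a micro-work-only and an all-platforms coefficient matters so much for the final count.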

Overall, I would say that both studies point to the need to put in place new and creative methods to account for these new forms of labour that traditional statistical studies fail to capture well. The price to pay, as both studies stress, is a high degree of uncertainty. I also dare suggest that both are mixed-method studies: while the design is essentially quantitative, input from smaller and even qualitative research is crucial – for example to get insight into multi-homing and multi-working.

Before concluding, let us recall the key results. Kässi et al. reckon that there are 163 million freelancer profiles registered on online labour platforms globally, of whom approximately 19 million have worked at least once, and 5 million work more intensely. We estimated that approximately 260,000 French residents are registered with micro-work platforms, of whom some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and a more restrictive measure of ‘very active’ workers would decrease this figure to 15,000.

Are these numbers large or small? Curiously, our French study attracted both criticisms: some worried that we might be overstating the importance of micro-work, others wondered why we bothered with such a tiny part of national GDP. It is not easy to answer this question, as the answer depends on the perspective taken and the goals – the same numbers would mean different things to policymakers and researchers, for example. Nevertheless, I think the point that matters to all is that this population exists and needs attention – despite its limited visibility and the fuzzy boundaries that make it so difficult to assess its size.

Internship offer (3 months, master’s level, spring 2021)

The research project TRIA (from its French title “Le TRavail de l’IA: éthique et gouvernance de l’automation”) is a study of the production systems of artificial intelligence. We investigate “micro-work” platforms, which allocate small standardized tasks to crowds of providers, and use the outputs of their work to prepare and annotate data for machine learning algorithms. We study the ramifications of this phenomenon in Spanish-speaking countries, which have remained under-researched so far despite their strong participation. With data from an empirical survey already started in 2020, and to be analyzed through mixed methods (including advanced NLP techniques), we will address important issues related to digital platform governance, online work ethics, and consequences (e.g. in terms of bias) of the use of these humans in the production of artificial intelligence.
Funded by the French National Center for Scientific Research (CNRS), the TRIA project brings together research teams in the Paris and Rennes regions in France, as well as partners in Spain (Barcelona and Valencia) and Canada (Toronto).

We are currently looking for a student intern to help us set up a survey targeting micro-workers in Spain and Spanish-speaking Latin American countries.
He/she will help us to:

  • update an inventory of micro-work platforms operating in Spanish-speaking countries, a first version of which was created in 2020;
  • launch a replication of the online questionnaire, already fielded on Microworkers.com, on another micro-work platform;
  • liaise and ensure communication between the project teams.

The applicant should:

  • be enrolled in the first or second year of a master’s degree in social science (like sociology, political science, management or economics);
  • have skills in the design and/or execution of questionnaire surveys;
  • have some prior knowledge of, or at least interest in, the transformations of work and/or the societal effects of digital technology;
  • be able to work independently, with advanced relational skills;
  • have a fairly good command of French or English, and at least a basic knowledge of Spanish.

More information is available in the enclosed job description.

Crowdworking Symposium 2020

With Antonio A. Casilli, I will be presenting a paper tomorrow at the Crowdworking Symposium organized by the University of Paderborn, Germany. Unfortunately, we will participate only online because of the health situation.

Our mini-paper (3 pages), entitled ‘Portraits of micro-workers: The real people behind AI in France’, is available here.

How many ‘micro-workers’?

Finally published! Counting ‘micro-workers’: Societal and methodological challenges around new forms of labour is a paper that I co-authored with Clément Le Ludec and Antonio A. Casilli, and that has just been published in a special issue of the journal Work Organisation, Labour & Globalisation.

What is it about? ‘Micro-work’ consists of fragmented data tasks that myriad providers execute on online platforms. While crucial to the development of data-based technologies, this barely visible and geographically dispersed activity is particularly difficult to measure. To fill this gap, we combined qualitative and quantitative methods (online surveys, in-depth interviews, capture-recapture techniques, and web traffic analytics) to count micro-workers in a single country, France. On the basis of this analysis, we estimate that approximately 260,000 people are registered with micro-work platforms. Of these some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and we speculate that using a more restrictive measure of ‘very active’ workers decreases this figure to 15,000. This analysis is important to better understand platform labour and the labour in the digital economy that lies behind artificial intelligence.

The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence

New article, co-authored with Antonio A. Casilli and Marion Coville, just published in Big Data & Society!

The paper sheds light on the role of digital platform labour in the development of today’s artificial intelligence, predicated on data-intensive machine learning algorithms. We uncover the specific ways in which outsourcing of data tasks to myriad ‘micro-workers’, recruited and managed through specialized platforms, powers virtual assistants, self-driving vehicles and connected objects. Using qualitative data from multiple sources, we show that micro-work performs a variety of functions, between three poles that we label, respectively, ‘artificial intelligence preparation’, ‘artificial intelligence verification’ and ‘artificial intelligence impersonation’. Because of the wide scope of application of micro-work, it is a structural component of contemporary artificial intelligence production processes – not an ephemeral form of support that may vanish once the technology reaches maturity. Through the lens of micro-work, we prefigure the policy implications of a future in which data technologies do not replace the human workforce but imply its marginalization and precariousness.

The three main functions of micro-work in the development of data-intensive, machine-learning based AI solutions.

The paper reports results of the 2017-18 DiPLab project, and is available here in open access.

Internship offer, TRIA project

I am currently seeking to hire a student intern for the new research project TRIA (Les TRavailleurs de l’Intelligence Artificielle / Los TRabajadores de la Inteligencia Artificial). Start as soon as possible, conditional on evolving regulations at the end of the current lockdown. Max 6 months.

A full description of the project is enclosed (in French).

Microwork platforms: a challenge for artificial intelligence, a challenge for employment?

With our sponsors France Stratégie and MSH Paris-Saclay, we convene an international conference on micro-work in Paris on June 13, 2019, followed by the first INDL (International Network on Digital Labor) workshop on June 14. The event will include a “meet the microworkers” panel on June 13, where workers, platform owners and client companies will take the stage. There will also be presentations of the results of national and international surveys (notably ours, DiPLab) on these emerging forms of work, and discussions with French and international academic and institutional experts.

After Uber, Deliveroo and other on-demand services, micro-work is a new form of labor mediated by digital platforms. Internet and mobile services recruit crowds to perform small, standardized and repetitive tasks on behalf of corporate clients, in return for fees ranging from a few cents to a few euros. These tasks generally require low skills: taking a picture in a store, recognizing and classifying images, transcribing bits of text, formatting an electronic file… Despite their apparent simplicity, these micro-tasks, performed by millions of people around the world, are crucial to create the databases needed to calibrate and “train” artificial intelligence algorithms.

Internationally, Amazon Mechanical Turk is the most widely known micro-work platform. In France and in French-speaking Africa, other platforms are attracting a growing number of workers to supplement or even substitute for their primary income. How widespread is the phenomenon? How can this new form of work be recognized, organized and regulated? How, finally, does it relate to traditional forms of employment?

Presentations and discussions are held in French and English, with simultaneous translation.

The programme is available here.