Three to five dollars: that’s the answer. As simple as that. I am talking about the behind-the-curtain market for personal data that sustains machine learning technologies, specifically the development of face recognition algorithms. To train their models, tech companies routinely buy selfies as well as pictures or videos of ID documents from low-paid micro-workers, mostly from lower-income countries such as Venezuela and the Philippines.

Josephine Lulamae of Algorithm Watch interviewed me for a comprehensive report on the matter. She shows how, in this globalized market, the rights of workers are hardly respected – both in terms of labour rights and of data protection provisions.

I saw many such cases in my research over the last two years, as I interviewed people in Venezuela who do micro-tasks on international digital platforms for a living. Their country is affected by a terrible economic and political crisis, with skyrocketing inflation, scarcity of even basic goods, and high emigration. Under these conditions, international platforms – which pay little, but in hard currency – have seen a massive inflow of Venezuelans since about 2017-18.

Some of the people I interviewed just could not afford to refuse a task that paid five dollars – at a moment in which the monthly minimum wage in Venezuela was plummeting to as little as three dollars. They do tasks that workers in richer countries such as Germany and the USA refuse to do, according to Lulamae’s report. Still, even the Venezuelans did not always feel comfortable doing tasks that involved providing personal data such as photos of themselves. One man told me that before, in better conditions, he would not have done such a task. Another interviewee told me that in an online forum, there were discussions about someone who had agreed to upload some selfies and later found his face in an advertisement on some website, and had to fight hard to get it removed. I had no means to fact-check whether this story was true, but the very fact that it circulated among workers is a clear sign that they worry about these matters.

On these platforms that operate globally, personal data protection does not work very well. This does not mean that clients openly violate the law: for example, workers told me they had to sign consent forms, as prescribed in the European General Data Protection Regulation (GDPR). However, people who live outside of Europe are less familiar with this legislation (and sometimes, with data protection principles more generally), and some of my interviewees did not fully understand the consent forms. More importantly, they have few means to contact clients, who typically avoid revealing their full identity on micro-working platforms – and therefore, they can hardly exercise their rights under GDPR (right of access, to rectification, to erasure etc.).

The rights granted by GDPR are comprehensive, but do not include property rights. The European legislator did not create a framework in which personal data can be bought and sold, and rather opted for guaranteeing inalienable rights to each and every citizen. However, this market exists and is flourishing, to the extent that it is serving the development of state-of-the-art technologies. Its existence is problematic, like the ‘repugnant’ markets for, say, human organs or babies for adoption, where moral arguments effectively counter economic interest. It is a market that thrives on global inequalities, and is a reminder of the high price to pay for today’s technical progress.

Series of talks in Chile on artificial intelligence, work and social networks

I am very excited and happy to begin a series of talks in Chile, mainly in Santiago and Talca, with Antonio A. Casilli this January. Many thanks to the Embassy of France in Chile, the Instituto Francés de Chile, and the Fundación Teatro a Mil for this wonderful opportunity. Thanks also to Juana Torres Cierpe and Francisca Ortiz Ruiz for their help in reaching colleagues, friends and students in Chile.

We will start with a talk titled “Digital platforms, online work and automation after the health crisis”, which will take place on Monday 16 January at 12:00 at the CUT headquarters (1 oriente # 809, Talca). In this talk we will present our research on the highly precarious micro-work that takes place on digital platforms. Many thanks to Professor Claudia Jordana Contreras and the School of Sociology of the Universidad Católica del Maule for organizing this event.

On Tuesday 17 January 2023, at 11:00, I will speak about “Artificial intelligence, labour transformations and inequalities: women’s work on digital ‘micro-task’ platforms” at the Institute of Sociology of the Universidad Católica, with the Quantitative and Computational Social Science Research Group. Thanks to Mauricio Bucca, who organized this event. We will be at the Pontificia Universidad Católica de Chile, Campus San Joaquín.

On Tuesday 17 in the afternoon (at 17:00), I will speak about “Ethics of artificial intelligence and other challenges for research on social networks” as part of the Summer School of the Centro de Investigación en Complejidad Social, Universidad del Desarrollo. Thanks to Jorge Fábrega Lacoa and his colleagues for organizing it.

Also on Tuesday 17, at 10:00, Antonio Casilli will give a talk at the Congreso Futuro event: “Global labour and artificial intelligence. The ‘human ingredients’ of automation” (Teatro Oriente, Pedro de Valdivia 099, Providencia).

On Friday 20 January 2023, at 10:00, Antonio and I will speak together about “The work behind artificial intelligence and automation in Latin America” at an international workshop organized by the Universidad de Chile – with Pablo Pérez (thanks for organizing!) and Francisca Gutiérrez, room 129, FASCO, Av. Ignacio Carrera Pinto 1045, Ñuñoa.

Next comes an event organized by the Instituto Francés, “La noche de las ideas” (Night of Ideas):

Friday 20 January 2023, 20:00, Centro Cultural La Moneda, Noche de las Ideas, Santiago: Paola Tubaro, “Automation: the end of the human?” (with Denis Parra and Javier Ibacache, Plaza de la Ciudadanía 26, Santiago).

Saturday 21 January 2023, 16:00, Centro Cultural La Moneda, Noche de las Ideas, Santiago: Antonio Casilli, “What is artificial intelligence hiding?” (with José Ulloa, Constanza Michelson and Paula Escobar, Plaza de la Ciudadanía 26, Santiago).

On Wednesday 26 January 2023, at 18:30 in Santiago, Antonio Casilli’s book “Esperando a los robots. Investigación sobre el trabajo del clic” (LOM, 2021) will be presented (with Paulo Slachevsky, Librería del Ulises Lastarria, José Victorino Lastarria 70, local 2, Paseo Barrio Lastarria).

A successful INDL-5 conference!

On 3-5 November 2022, I was at the Department of Sociology of the National and Kapodistrian University of Athens (NKUA) for the 5th INDL conference “Features and Futures of Digital Labor”. The conference was co-organized by us (the DiPLab project at the Polytechnic Institute of Paris) together with the International Labor Organization (ILO) and the Labor Institute of the General Confederation of Greek Workers.

The INDL (International Network on Digital Labor) project started as ENDL (the “E” standing for “European”) 5 years ago with an inaugural meeting in Paris. Since then, it has expanded internationally, and its members organized larger conferences in Paris (2019), Toronto (2019), Milan (2020, online), and Edinburgh (2021, online). INDL’s conference in Athens was the first in-person meeting since the beginning of the Covid-19 pandemic.

The key idea behind the creation of INDL, and the organization of these conferences, is that digital labor is central to the digital transformation of society. Despite its pervasiveness, though, the ways it is inscribed in the current organization of production and the state remain elusive. Different fields of the social and economic sciences, political theory, law, and philosophy have attempted to capture its distinctive attributes. The group’s initiatives contribute to this conversation by mapping the new working environments and fostering dialogue around the nature of digital work and the possible futures that academic research may help bring about.

Artificial Intelligence and Globalization: Data Labor and Linguistic Specificities (AIGLe)

We organized the one-day conference AIGLe on 27 October 2022 to present the outcomes of interdisciplinary research conducted by our DiPLab teams in French-speaking African countries (ANR HuSh Project) and Spanish-speaking countries in Latin America (CNRS-MSH TrIA Project). Both initiatives study the human labor necessary to generate and annotate the data needed to produce artificial intelligence, to check outputs, and to intervene in real time when algorithms fail. Researchers from economics, sociology, computer science, and linguistics shared exciting new results and discussed them with the audience.

AIGLe is part of the project HUSh (The HUman Supply cHain behind smart technologies, 2020-2024), funded by ANR, and the research project TRIA (The Work of Artificial Intelligence, 2020-2022), co-financed by the CNRS and the MSH Paris Saclay. This event, under the aegis of the Institut Mines-Télécom, was organized by the DiPLab team with the support of ANR, MSH Paris-Saclay and the Ministry of Economy and Finance.

Session 1 – Maxime Cornet & Clément Le Ludec (IP Paris, ANR HUSH Project): Unraveling the AI Production Process: How French Startups Externalise Data Work to Madagascar. Discussant: Mohammad Amir Anwar (U. of Edinburgh)

Session 2 – Chiara Belletti and Ulrich Laitenberger (IP Paris, ANR HUSH Project): Worker Engagement and AI Work on Online Labor Markets. Discussant: Simone Vannuccini (U. of Sussex)

Session 3 – Juana-Luisa Torre-Cierpe (IP Paris, TRIA Project) & Paola Tubaro (CNRS, TRIA Project): Uninvited Protagonists: Venezuelan Platform Workers in the Global Digital Economy. Discussant: Maria de los Milagros Miceli (Weizenbaum Institut)

Session 4 – Ioana Vasilescu (CNRS, LISN, TRIA Project), Yaru Wu (U. of Caen, TRIA Project) & Lori Lamel (LISN CNRS): Socioeconomic profiles embedded in speech: modeling linguistic variation in micro-workers interviews. Discussant: Chloé Clavel (Télécom Paris, IP Paris)

Learners in the loop: hidden human skills in machine intelligence

I am glad to announce the publication of a new article in a special issue of the journal Sociologia del lavoro, dedicated to digital labour.

Today’s artificial intelligence, largely based on data-intensive machine learning algorithms, relies heavily on the digital labour of invisibilized and precarized humans-in-the-loop who perform multiple functions of data preparation, verification of results, and even impersonation when algorithms fail. This form of work contributes to the erosion of the salary institution in multiple ways. One is commodification of labour, with very little shielding from market fluctuations via regulative institutions, exclusion from organizational resources through outsourcing, and transfer of social reproduction costs to local communities to reduce work-related risks. Another is heteromation, the extraction of economic value from low-cost labour in computer-mediated networks, as a new logic of capital accumulation. Heteromation occurs as platforms’ technical infrastructures handle worker management problems as if they were computational problems, thereby concealing the employment nature of the relationship, and ultimately disguising human presence. My just-published paper highlights a third channel through which the salary institution is threatened, namely misrecognition of micro-workers’ skills, competencies and learning. Broadly speaking, salary can be seen as the framework within which the employment relationship is negotiated and resources are allocated, balancing the claims of workers and employers. In general, the most basic claims revolve around skill, and in today’s ‘society of performance’ where value is increasingly extracted from intangible resources and competencies, unskilled workers are substitutable and therefore highly vulnerable. In human-in-the-loop data annotation, tight breakdown of tasks, algorithmic control, and arm’s-length transactions obfuscate the competence of workers and discursively undermine their deservingness, shifting power away from them and voiding the equilibrating role of the salary institution.

Following Honneth, I define misrecognition as the attitudes and practices that result in people not receiving due acknowledgement for their value and contribution to society, in this case in terms of their education, skills, and skill development. Platform organization construes work as having little value, and creates disincentives for micro-workers to engage in more complex tasks, weakening their status and their capacity to be perceived as competent. Misrecognition is endemic in these settings and undermines workers’ potential for self-realization, negotiation and professional development.

My argument is based on original empirical data from a mixed-method survey of human-in-the-loop workers in two previously under-researched settings, namely Spain and Spanish-speaking Latin America.

An openly accessible version of the paper is available from the HAL repository.

Human listeners and virtual assistants: privacy and labor arbitrage in the production of smart technologies

I’m glad to announce the publication of new research, as a chapter in the fabulous Digital Work in the Planetary Market, a volume edited by Mark Graham and Fabian Ferrari and published in open access by MIT Press.

The chapter, co-authored with Antonio A. Casilli, starts by recalling how, in spring 2019, public outcry followed media revelations that major producers of voice assistants recruit human operators to transcribe and label users’ conversations. These high-profile cases uncovered the paradoxically labor-intensive nature of automation, the ultimate cause of the highly criticized privacy violations.

The development of smart solutions requires large amounts of human work. Sub-contracted on demand through digital platforms and usually paid by piecework, myriad online “micro-workers” annotate, tag, and sort the data used to prepare and calibrate algorithms. These humans are also needed to check outputs – such as automated transcriptions of users’ conversations with their virtual assistant – and to make corrections if needed, sometimes in real time. The data that they process include personal information, of which voice is an example.

We show that the platform system exposes both consumers and micro-workers to high risks. Because producers of smart devices conceal the role of humans behind automation, users underestimate the degree to which their privacy is challenged. As a result, they might unwittingly let their virtual assistant capture children’s voices, friends’ names and addresses, or details of their intimate life. Conversely, the micro-workers who hear or transcribe this information face the moral challenge of taking the role of intruders, and bear the burden of maintaining confidentiality. Through outsourcing, platforms often leave them without sufficient safeguards and guidelines, and may even shift onto them the responsibility to protect the personal data they happen to handle.

Besides, micro-workers themselves release their personal data to platforms. The tasks they do include, for example, recording utterances for the needs of virtual assistants that need large sets of, say, ways to ask about the weather to “learn” to recognize such requests. Workers’ voices, identities and profiles are personal data that clients and platforms collect, store and re-use. With many actors in the loop, privacy safeguards are looser and transparency is harder to ensure. Lack of visibility, not to mention of collective organization, prevents workers from taking action.

Note: Description of one labor-intensive data supply chain. A producer of smart speakers located in the US outsources AI verification to a Chinese platform (1) that relies on a Japanese online service (2) and a Spanish sub-contractor (3) to recruit workers in France (4). Workers are supervised by an Italian company (5), and sign up to a microtask platform managed by the lead firm in the US (6). Source: Authors’ elaboration.

These issues become more severe when micro-tasks are subcontracted to countries where labor costs are low. Globalization enables international platforms to allocate tasks for European and North American clients to workers in Southeast Asia, Africa, and Latin America. This global labor arbitrage goes hand in hand with a global privacy one, as data are channeled to countries where privacy and data protection laws provide uneven levels of protection. Thus, we conclude that any solution must be dual – protecting workers to protect users.

The chapter is available in open access here.

Hidden inequalities: the gendered labour of women on micro-tasking platforms

Around the world, myriad workers perform data tasks on online labour platforms to fuel the digital economy. Mostly short, repetitive and poorly paid, these so-called ‘micro-tasks’ include for example labelling objects in images, classifying tweets, recording utterances, and transcribing audio files – notably to satisfy the data appetite of today’s fast-growing artificial intelligence industry. While casualization of labour and low pay have drawn sharp criticism of these platforms, they appear gender-blind and accessible even to people with basic skills. Women with care or household duties may particularly benefit from the time flexibility and the possibility to work from home that platforms offer. So, are these new labour arrangements gender equalizers after all?

In a new paper co-authored with Marion Coville, Clément Le Ludec and Antonio A. Casilli, we demonstrate that this new form of online labour fails to fill gender gaps, and may even exacerbate them. We proceed in three steps. First, we show that legacy inequalities in the professional and domestic spheres turn platform-mediated micro-tasking into a ‘third shift’ that adds to already heavy schedules. Both working fathers and working mothers experience it, but the structure of the other two shifts affects their experience. Looking at their time use, it turns out that men dedicate long and uninterrupted slots of time to each activity: their main job, their share of household duties, leisure and micro-work. They tend to do all micro-tasks in a row, usually at night after work or in the morning before starting. Women, instead, have more fragmented schedules, and do micro-work during short breaks, here and there, eating into their leisure time. This is one reason why they earn less on platforms: they have short slots of time available, so they cannot search for better-paid tasks, and just content themselves with whatever is available at that moment.

Time use of typical female (left) and male (right) micro-workers, both of whom have a main job in addition to platform micro-tasks, and dependent children.

Second, we submit that the human capital of male and female data workers differs, with women less likely to have received training in science and technology fields.

Highest educational qualification (left) and discipline of specialization (right) of men and women micro-workers. Data collected in France, 2018 (n = 908).

Third, their social capital differs: using a position generator instrument to capture workers’ access to the informational and support resources that may come from contacts with people in different occupations, we show that women have fewer ties to digital-related professionals who could provide them with knowledge and advice to successfully navigate the platform world.

Gender assortativity index for each occupation in the 48-item position generator that measures respondents’ social capital. Each panel represents respondents’ choices, ordered from lowest (negative) to highest (positive) degree of similarity. Top panel: female respondents, bottom panel: male respondents. The bars corresponding to digital and computing occupations are hatched.

Taken together, these factors leave women with fewer career prospects within a tech-driven workforce, and reproduce the relegation of women to lower-level computing work as observed in the history of twentieth-century technology.

The full paper is available in open access here.

It is part of a full special issue of Internet Policy Review on ‘The gender of the platform economy’, guest-edited by M. Fuster Morell, R. Espelt and D. Megias.

Counting online workers

I have just discovered this very interesting new paper by Otto Kässi, Vili Lehdonvirta and Fabian Stephany. Their data-driven count of online workers is reminiscent of this research, published last year, which I did with Clément Le Ludec and Antonio A. Casilli.

There are differences of course: theirs is a large multi-country study while we focused on one national setting (France). Also: Kässi et al. consider online labour in general, while we looked specifically at micro-work.

Nevertheless, there are striking similarities. Both studies included larger as well as smaller and more peripheral platforms, often left aside in previous research. Both started from the numbers of registered users declared by the platforms in scope, although this is likely an upper bound. Indeed registering may not mean using, and for example researchers (like ourselves) and journalists would register only to observe, especially when registration is open and easy.

Also, both studies used web traffic analysis data but for different purposes. We used them as an estimate of minimally active users – those who connect at least monthly, as per the definition given by the providers of these data. For the platforms we observed, these numbers tend to be lower than registrations.

Instead, Kässi et al. have used these data to assess registration numbers for the platforms that do not report them. My first reaction would be to think their estimates are likely a lower bound. But presumably their use of a mix of sources, and the seriousness and caution with which they have conducted their estimate, provide enough correction.

Finally, both studies attempted to correct estimates downward by taking into account multi-homing – the tendency of users to rely on multiple platforms. The coefficient of Kässi et al. is 1.83, ours was 1.27. The gap is due to the fact that we focused only on micro-work: if we had counted participation across all types of online labour platforms, our coefficient would be just below 2 – not far from theirs! Kässi et al. also correct for the possibility of multiple workers using a single account, which we did not observe in our French sample. One might imagine other corrections depending on observed usages. For example my ongoing Latin American study of micro-workers suggests that there are unofficial sales and purchases of highly rated platform accounts, more likely to access better-paying tasks – again, something we did not observe in France. Kässi et al. rightly note that all these corrections come from ad hoc surveys and should be interpreted with caution.
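The multi-homing correction discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it reads the coefficient as the average number of platforms on which one worker holds an account, and divides the summed registrations by it. The function name and the registration total are hypothetical; only the coefficients 1.83 and 1.27 come from the two studies.

```python
def correct_for_multihoming(total_registrations, coefficient):
    """Estimate distinct workers from registrations summed across platforms.

    Assumption: `coefficient` is the average number of platforms on which
    one worker holds an account, so dividing by it removes double counting.
    """
    if coefficient < 1:
        raise ValueError("a worker is registered on at least one platform")
    return total_registrations / coefficient


# Hypothetical registration total, NOT a figure from either study.
registrations = 1_000_000

# The two coefficients reported in the text.
for coefficient in (1.83, 1.27):
    workers = correct_for_multihoming(registrations, coefficient)
    print(f"coefficient {coefficient}: about {workers:,.0f} distinct workers")
```

The same registration total yields a noticeably smaller headcount under the higher coefficient, which is why the choice of coefficient, itself drawn from ad hoc surveys, matters so much for the final estimate.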

Overall, I would say that both studies point to the need to put in place new and creative methods to account for these new forms of labour that traditional statistical studies fail to capture well. The price to pay, as both studies stress, is a high degree of uncertainty. I also dare suggest that both are mixed-method studies: while the design is essentially quantitative, input from smaller and even qualitative research is crucial – for example to get insight into multi-homing and multi-working.

Before concluding, let us recall the key results. Kässi et al. reckon that there are 163 million freelancer profiles registered on online labour platforms globally, of whom approximately 19 million have worked at least once, and 5 million work more intensely. We estimated that approximately 260,000 French residents are registered with micro-work platforms, of whom some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and a more restrictive measure of ‘very active’ workers would decrease this figure to 15,000.

Are these numbers large or small? Curiously, our French study attracted both criticisms: some worried that we might be overstating the importance of micro-work, others wondered why we bothered with such a tiny part of national GDP. It is not easy to answer this question, as the answer depends on the perspective taken and the goals – the same numbers would mean different things to policymakers and researchers, for example. Nevertheless, I think the point that matters to all is that this population exists and needs attention – despite its limited visibility and the fuzzy boundaries that make it so difficult to assess its size.

The platform economy, labour and Covid-19

On 18 September 2020, I present my research on the platform economy and its impact on labour in Covid-19 times at Nantes Digital Week, as part of a special event organized by CGT, a trade union.

The mobility restrictions that accompanied the pandemic encouraged use of digital tools to socialize, study and work, suggesting that automation is gaining ground and that technology enables contactless – hence safe – interactions in much of our social life. Yet behind apparent automation, precarious and unprotected human labour is hidden. Workers recruited through digital platforms to make these solutions work are in fact disproportionately exposed to risks. I illustrate these ideas in three main cases: food delivery workers, who enabled the restaurant industry to withstand the crisis even during lockdown; commercial content moderators, who are to return to the office sooner than others, to protect our safety online; and AI micro-workers, who trained tools whose sales have gone up during stay-at-home rules, such as voice assistants, and helped the creation of datasets for much-needed health applications.

How many ‘micro-workers’?

Finally published! Counting ‘micro-workers’: Societal and methodological challenges around new forms of labour is a paper that I co-authored with Clément Le Ludec and Antonio A. Casilli, and that has just been published in a special issue of the journal Work Organisation, Labour & Globalisation.

What is it about? ‘Micro-work’ consists of fragmented data tasks that myriad providers execute on online platforms. While crucial to the development of data-based technologies, this barely visible and geographically dispersed activity is particularly difficult to measure. To fill this gap, we combined qualitative and quantitative methods (online surveys, in-depth interviews, capture-recapture techniques, and web traffic analytics) to count micro-workers in a single country, France. On the basis of this analysis, we estimate that approximately 260,000 people are registered with micro-work platforms. Of these, some 50,000 are ‘regular’ workers who do micro-tasks at least monthly, and we speculate that using a more restrictive measure of ‘very active’ workers decreases this figure to 15,000. This analysis is important to better understand platform labour and the labour in the digital economy that lies behind artificial intelligence.
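Among the methods listed above, capture-recapture lends itself to a short illustration. The sketch below uses Chapman’s bias-corrected form of the textbook Lincoln-Petersen estimator. This is an assumption made for illustration only, since the summary here does not say which estimator the paper applied. The function name and the counts are hypothetical.

```python
def chapman_estimate(first_sample, second_sample, in_both):
    """Chapman's bias-corrected Lincoln-Petersen population estimate.

    first_sample:  number of individuals observed in the first source
    second_sample: number of individuals observed in the second source
    in_both:       number of individuals observed in both sources
    """
    if in_both > min(first_sample, second_sample):
        raise ValueError("overlap cannot exceed either sample size")
    return (first_sample + 1) * (second_sample + 1) / (in_both + 1) - 1


# Hypothetical counts, NOT data from the paper.
estimate = chapman_estimate(first_sample=400, second_sample=300, in_both=24)
print(f"Estimated population: about {estimate:,.0f}")
```

The intuition is that a small overlap between two independent samples signals a large hidden population, while a large overlap signals that most of the population has already been seen.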