Organising AI data workers: barriers, alternatives, and ways forward

The work that fuels AI – from data labelling to content moderation, output checking, red teaming, and so on – is typically outsourced. Digital platforms that operate as online marketplaces play a pivotal role in making this possible. They extend outsourcing to individuals, removing the informational bottlenecks that previously limited it to (multi-person) firms. Platforms treat workers as independent contractors and do not guarantee labour rights. Job insecurity, income volatility, wage theft, and in some cases mental health issues, are common. However, cases of worker mobilisation remain rare. Why, and how can this be changed?

Barriers

A specific challenge that arises on platforms is the asymmetric distribution of work, with a relatively small number of users doing most tasks, and a long tail of minimally-active (or even inactive) people. The reason is that registration is (more or less) open but demand is variable, so that a worker must beat the competition to find tasks to do. This has two main implications. One is that from the start, there is an incentive to see other workers as competitors rather than colleagues. The other is that it is difficult to motivate people in the long tail to take action: they are more likely to exit than to voice their grievances.

Lack of a shared worker identity is another crucial gap. Data work was initially portrayed as simple and straightforward, and sometimes even considered a form of consumption or leisure. Many platforms carefully avoid even using the word ‘worker’, instead preferring terms like ‘Turkers’ (Amazon Mechanical Turk) or ‘Tolokers’ (Toloka). The very fact that workers themselves often take the rhetoric of simple tasks at face value, and struggle to see themselves as workers, is indicative of their experience of disrespect, due to widespread misrecognition.

Juliet Schor writes that “platform earners are not only independent in a legal sense; they also typically do their work independently of other workers”. Technology enables extreme fragmentation of labour and rules out teamwork. Nor do workers ever meet their clients (technology producers), due to platform intermediation. In sum, platform data work isolates workers both from their peers and from other stakeholders (as the above picture cleverly represents). How can you organise if you are alone?

Alternatives

In this context, it is useful to broaden our understanding of worker organisation. Beyond collective acts undertaken within an institutionalised framework, we should also embrace informal, unorganised and subtle actions, which can nevertheless lead to positive outcomes.

In crisis-stricken Venezuela, very large numbers of people started data work on online platforms to earn much-needed hard currency. Here, workers have leveraged their personal networks of family and friends to surmount the multiple obstacles posed by platform work. They never created any official organisation, and their actions would rarely qualify as forms of resistance. Some were mere attempts to limit losses in a harshly competitive environment. As researchers, we need to be mindful of cases in which, owing to an unfavourable context, workers prioritise the (short-run) need to counter local scarcity through online earnings, rather than any (longer-run) fight against unfavourable platform management. (More on this case here.)  

Ways forward

There are nevertheless signs that successful strategies exist. Kenya is a rare example of organisation: data workers and content moderators in this country initiated actions that attracted international attention and, as in a virtuous circle, support. Of course, not all is easy for them, but Kenya is now a reference, an example for everyone else. This suggests that it is essential to give visibility to workers’ conditions and to any action they undertake to defend their rights.

The other lesson learned from the Kenyan case is that collaborations between multiple stakeholders can achieve a lot in supporting workers and triggering change. Not only established unions, but also researchers, policymakers, and activists in various NGOs (for example, those working on personal data protection, against discrimination, etc.) can act as multipliers of the resources available to workers.

American AI, made in Venezuela

The political tensions between Venezuela and the United States are at an all-time high, after news of strikes on Caracas and the alleged capture of President Maduro and his wife. The recent escalation follows years of economic sanctions and a deep divide between the two countries.

And yet, Venezuela has massively supplied cheap data work to US-based technology producers throughout all these years. Through digital labour platforms, an educated but impoverished workforce made its way to (the bottom of) the supply chains of US-directed artificial intelligence (AI).

Since about 2017, high inflation, increasing scarcity of even basic goods, and widespread poverty have pushed Venezuelans to work for international platforms that pay in US dollars, albeit at low rates. They have come to constitute a large reservoir for technology producers, mainly (though not only) in the United States. Known for their willingness to accept even the lowest pay rates in the data work market, Venezuelan workers have annotated hundreds of thousands of videos and images for the development of (for example) self-driving vehicles. Ironically, the very policies of Chavista governments – from Chávez himself to Maduro – made this possible. Cheap access to electricity and promotion of digital literacy, including through the widespread distribution of locally produced computers (‘Canaima’) to students and schoolchildren, provided people with the necessary infrastructure to perform data work. Even when outdated and malfunctioning, this equipment played a crucial role in enabling widespread Venezuelan participation in the AI pipeline.

Nicolas Gourault (2020). VO: A documentary and sensory investigation about the role of human workers in the training of driverless cars. Source: https://nicolasgourault.fr/films/vo

For Venezuelan workers, platform labour has constituted a resilience strategy against adverse local conditions. Participation has never been easy owing to frequent power cuts, slow internet connections, and aging devices, not to mention the difficulty of working almost entirely in English. The high educational levels and computing skills of many workers (including experienced professionals and science/technology students), and their embeddedness in densely knit networks of support, offered solutions. At the same time, work on platforms is not without challenges, and all Venezuelan data workers have experienced some form of disrespect. Being paid less than peers in neighbouring countries, or even being offered fewer tasks than these foreign peers, are examples of this. At some point, they had to endure a widespread perception that they do not work well. Resisting international platforms can be more challenging than bypassing local restrictions, and Venezuelan workers limited themselves to occasional acts of minor cheating involving only very few of them.

Venezuelans’ resolve to move out of the crisis and the networked relationships that sustain each of them against hardship have made them massively present, as ‘uninvited protagonists,’ in international data work platforms. Conversely, some AI companies and platforms (from the Global North in general, and from the United States specifically) targeted Venezuela deliberately, not so much for the qualities and skills of its highly educated population as for its low cost at a time of crisis. These platform-mediated encounters enabled short-term solutions, but haven’t raised Venezuela out of poverty, and haven’t ensured a durable provision of high-quality data for AI.

What comes next inside Venezuela is deeply unclear, but unfortunately, nothing (for now) suggests any recognition of the role of these workers in the technology industry, or any opportunity to reshape its outputs in more equitable and respectful ways.

Uninvited protagonists: the social networks of Venezuelan AI data workers

After years of work, the long-awaited good news: my article ‘Uninvited Protagonists: The Networked Agency of Venezuelan Platform Data Workers’, co-authored with Juana Torres-Cierpe, has just been published in New Technology, Work and Employment!

Workers in Venezuela are powering AI production, often under tough conditions. Sanctions and a deep political-economic crisis have pushed them to work for platforms that pay in US dollars, albeit at low rates. They constitute a large reservoir for technology producers from rich countries. But they are not passive players.

They build resilience, rework their environment, and sometimes engage in acts of resistance, with support from different segments of their personal networks. From strong local ties to loose online connections, these informal webs help them cope, adapt, and occasionally push back. Their diversified relationships comprise an unofficial and often hidden, albeit largely digitised, relational infrastructure that sustains their work and shapes collective action.

These findings invite us to rethink agency as embedded in workers’ personal networks. To respond to adversities, one must liaise with equally affected peers, with family and friends who offer support, etc. Social ties ultimately determine who is enabled to respond, and who is not; whether any benefits and costs are shared, and with whom; whether any solution will be conflictual or peaceful. Social networks are not accessory but constitute the very channel through which Venezuelan data workers cope with hardship.

Not all relationships play the same role, though. Venezuelans discover online data work through their strong ties with family, close friends, and neighbours. To convert their online earnings into local currency, they rely on their broader social networks of relatives and friends living abroad and indirect relationships with intermediaries. For managing their day-to-day activities, Venezuelans expand their social networks through online services like Facebook, WhatsApp, and Telegram, connecting with diverse and less-close peers within and outside the country. Different social ties affect the various stages of the data working experience.

Overall, no Venezuelan could work alone – and the networked interactions that sustain each of them against hardship have made them massively present, as ‘uninvited protagonists,’ in international platforms. Their massive presence in the planetary data-tasking market is a supply-driven rather than demand-driven phenomenon.

This analysis also sheds light on the reasons why mobilisation is uncommon among platform data workers. Other studies noted diverging orientations of workers, unclear goals, lack of focus, and insufficient leadership. Another powerful reason hinges upon the predominance of weak ties in building up online group membership: indeed, distant acquaintances are insufficient to prompt people to action if their intrinsic motivations are low.

The article is available in open access here.

The digital labour of AI in Latin America

Another article has just been published! Like the previous one, it results from a DiPLab group collaboration (with A.A. Casilli, M. Fernández Massi, J. Longo, J. Torres Cierpe and M. Viana Braz) and uses data from multiple countries. It is entitled ‘The digital labour of artificial intelligence in Latin America: a comparison of Argentina, Brazil, and Venezuela’ and is part of a special issue of Globalizations on ‘The Political Economy of AI in Latin America’. This article lifts the veil on the precarious and low-paid data workers who, from Latin America, engage in AI preparation, verification, and impersonation, often for foreign technology producers. Focusing on three countries (Argentina, Brazil, and Venezuela), we use original mixed-method data to compare and contrast these cases in order to reveal common patterns and expose the specificities that distinguish the region.

The analysis unveils the central place of Latin America in the provision of data work. To bring costs down, AI production thrives on countries’ economic hardship and inequalities. In Venezuela and to a lesser extent Argentina, acute economic crisis fuels competition and favours the emergence of ‘elite’ (young and STEM-educated) data workers, while in more stable but very unequal Brazil, this activity is left to relatively underprivileged segments of the workforce. AI data work also redefines these inequalities insofar as, in all three countries, it blends with the historically prevalent informal economy, with workers frequently shifting between the two. There are spillovers into other sectors, with variations depending on country and context, which tie informality to inequality.


Our study has policy implications at global and local levels. Globally, it calls for more attention to the conditions of AI production, especially workers’ rights and pay. Locally, it advocates solutions for the recognition of the skills and experience of data workers, in ways that may support their further professional development and trajectories, possibly also facilitating some initial forms of worker organisation.


The version of record is here, while an open-access preprint is available here.

Where does AI come from?

I am thrilled to announce that an important article has just seen the light. Entitled ‘Where does AI come from? A global case study across Europe, Africa, and Latin America’, it is part of a special issue of New Political Economy on ‘Power relations in the digital economy’. It is the result of joint work that I have done with members of the DiPLab team (A.A. Casilli, M. Cornet, C. Le Ludec and J. Torres Cierpe) on the organisational and geographical forces underpinning the supply chains of artificial intelligence (AI). Where and how do AI producers recruit workers to perform data annotation and other essential, albeit lower-level, supporting tasks to feed machine-learning algorithms? The literature reports a variety of organisational forms, but the reasons for these differences and the ways data work dovetails with local economies have long remained under-researched. This article addresses precisely this gap, clarifying the structure and organisation of these supply chains, and highlighting their impacts on labour conditions and remuneration.

Framing AI as an instance of the outsourcing and offshoring trends already observed in other globalised industries, we conduct a global case study of the digitally enabled organisation of data work in France, Madagascar, and Venezuela. We show that AI supply chains procure data work via a mix of arm’s length contracts through marketplace-like platforms, and of embedded firm-like structures that offer greater stability but less flexibility, with multiple intermediate arrangements that give different roles to platforms. Each solution suits specific types and purposes of data work in AI preparation, verification, and impersonation. While all forms reproduce well-known patterns of exclusion that harm externalised workers, especially in the Global South, disadvantage manifests unevenly depending on the structure of the supply chains, with repercussions on remuneration, job security, and working conditions.

Marketplace- and firm-like platforms in the supply chains for data work in Europe, Africa, and Latin America. Dark grey countries: main case studies, light grey countries: comparison cases. Organisational modes range from almost totally marketplace oriented (darker rectangle, Venezuela) to almost entirely firm oriented (lighter rectangle, Madagascar). AI preparation (darker circle) is ubiquitous, but AI verification (darker triangle) and AI impersonation (darker star) tend to happen in ‘deep labour’ and firm-like organisations where embeddedness is higher.

We conclude that responses based only on worker reclassification, as attempted in some countries especially in the Global North, are insufficient. Rather, we advocate a policy mix at both national and supra-national levels, also including appropriate regulation of technology and innovation, and promotion of suitable strategies for economic development.

The version of record is here, while an open-access preprint is available here.