Human listeners and virtual assistants: privacy and labor arbitrage in the production of smart technologies

I’m glad to announce the publication of new research, as a chapter in the fabulous Digital Work in the Planetary Market, a volume edited by Mark Graham and Fabian Ferrari and published in open access by MIT Press.

The chapter, co-authored with Antonio A. Casilli, starts by recalling how, in spring 2019, public outcry followed media revelations that major producers of voice assistants recruit human operators to transcribe and label users’ conversations. These high-profile cases uncovered the paradoxically labor-intensive nature of automation, the ultimate cause of the highly criticized privacy violations.

The development of smart solutions requires large amounts of human work. Sub-contracted on demand through digital platforms and usually paid by piecework, myriad online “micro-workers” annotate, tag, and sort the data used to prepare and calibrate algorithms. These humans are also needed to check outputs – such as automated transcriptions of users’ conversations with their virtual assistant – and to make corrections if needed, sometimes in real time. The data they process include personal information, of which voice is an example.

We show that the platform system exposes both consumers and micro-workers to high risks. Because producers of smart devices conceal the role of humans behind automation, users underestimate the degree to which their privacy is challenged. As a result, they might unwittingly let their virtual assistant capture children’s voices, friends’ names and addresses, or details of their intimate life. Conversely, the micro-workers who hear or transcribe this information face the moral challenge of taking the role of intruders, and bear the burden of maintaining confidentiality. Through outsourcing, platforms often leave them without sufficient safeguards and guidelines, and may even shift onto them the responsibility to protect the personal data they happen to handle.

Moreover, micro-workers themselves release their personal data to platforms. Their tasks include, for example, recording utterances for virtual assistants, which need large sets of, say, ways to ask about the weather in order to “learn” to recognize such requests. Workers’ voices, identities, and profiles are personal data that clients and platforms collect, store, and re-use. With many actors in the loop, privacy safeguards are looser and transparency is harder to ensure. Lack of visibility, not to mention of collective organization, prevents workers from taking action.

Note: Description of one labor-intensive data supply chain. A producer of smart speakers located in the US outsources AI verification to a Chinese platform (1) that relies on a Japanese online service (2) and a Spanish sub-contractor (3) to recruit workers in France (4). Workers are supervised by an Italian company (5), and sign up to a microtask platform managed by the lead firm in the US (6). Source: Authors’ elaboration.

These issues become more severe when micro-tasks are subcontracted to countries where labor costs are low. Globalization enables international platforms to allocate tasks for European and North American clients to workers in Southeast Asia, Africa, and Latin America. This global labor arbitrage goes hand in hand with a global privacy one, as data are channeled to countries where privacy and data protection laws provide uneven levels of protection. Thus, we conclude that any solution must be dual – protecting workers to protect users.

The chapter is available in open access here.

Big data and the hypothesis of the end of privacy

In the late 2000s, voices suggesting that our societies might be nearing the “end of privacy” grew increasingly loud. Our cultural, political, and regulatory environment was on the verge of major transformation – so went the narrative. Businesses rejoiced, as less privacy and more information famously oil the economy.

In a video interview with the Italian media outlet Idee Sottosopra, I review the courses of action taken by various stakeholders, in particular Internet companies, and examine their conflicts and controversies. I show how the very concept of privacy, inherited from a long legal and judicial tradition, should be revised and redefined to appropriately describe today’s online interactions.

Overall, there is no deterministic and inevitable tendency to exclude privacy from our societies, but rather a tension between social forces for and against privacy, which has accompanied the advent of the digital economy and especially social media. The positions of stakeholders, especially users, are often ambiguous, and social media companies have attempted to leverage this ambiguity to their own advantage.

Yet civil society reactions have grown increasingly strong, and after the initial David-vs-Goliath attempts of individuals and small associations, more and more authoritative institutions have taken the defence of privacy seriously. We are no longer limited to costly and barely visible individual choices, and especially after the entry into force of the GDPR in Europe, we now have an unprecedented opportunity to act at a more systemic level.

Big Data. L’ipotesi della fine della Privacy | Società Digitale | Idee Sottosopra