2024 in review

My great regret is that I always have very little time to write posts, and the emptiness of this blog does not reflect the numerous, great and stimulating scientific events and opportunities that I have enjoyed throughout 2024. As a last-minute remedy (with a promise to do better next year…hopefully), I try to summarize the landmarks here, month by month.

In January, I launched the Voices from Online Labour (VOLI) project, which I coordinate with a grant of about €570,000 from the French National Agency for Research. This four-year initiative brings together expertise from sociology, linguistics, and AI technology across multiple institutions, including four French research centres, a speech technology company, and three international partners.

In February, with the Diplab team, I spent two exciting days at the European Parliament in Brussels, engaging in profound discussions with and about platform workers as part of the 4th edition of the Transnational Forum on Alternatives to Uberization. I chaired a panel with data workers and content moderators from Europe and beyond, aiming to raise awareness about the difficult working conditions of those who fuel artificial intelligence and ensure safe participation in social media.

In March, three publications saw the light. One is a solo-authored chapter, in French, on ‘Algorithmes, inégalités, et les humains dans la boucle’ (Algorithms, inequalities, and the humans in the loop) in a collective book entitled ‘Ce qui échappe à l’intelligence artificielle’ (What AI cannot do). The other two are journal articles that may seem a little less close to my ‘usual’ topics, but they are important because they constitute experiments in research-informed teaching. One is a study of the 15-minute city concept applied to Paris, carried out in collaboration with a colleague, S. Berkemer of Ecole Polytechnique, and a team of brilliant ENSAE students. The other is an analysis of the penetration of AI into a specific field of research, neuroscience, showing that for all its alleged potential, it created a confined subfield but did not entirely disrupt the discipline. The study, which belongs to a larger project on AI in science, was part of the PhD research of S. Fontaine (who has now got his degree!) and was co-authored with his co-supervisors F. Gargiulo and M. Dubois.

In April, I co-published the final report from the study carried out for the European Parliament, ‘Who Trains the Data for European Artificial Intelligence?’. Despite massive offshoring of data tasks to lower-income countries in the Global South, we find that there are still data workers in Europe. They often live in countries where standard labour markets are weaker, like Portugal, Italy and Spain; in more dynamic countries like Germany and France, they are often immigrants. They do data work because they lack sufficiently good alternative opportunities, although most of them are young and highly educated.

I then attended two very relevant events. On 30 April-1 May, I was at a Workshop on Driving Adoption of Worker-Centric Data Enrichment Guidelines and Principles, organised by Partnership on AI (PAI) and Fairwork in New York City to bring together representatives of AI companies, data vendors and platforms, and researchers. The goal was to discuss options to improve working conditions from the side of the employers and intermediaries. On 28 May, I was in Cairo, Egypt, to attend the very first conference of the Middle East and Africa chapter of INDL (International Network on Digital Labour), the research network I co-founded. It was a fantastic opportunity to start opening the network to countries that were less present before, and whose voices we would like to hear more.

June also provided exciting opportunities, with a workshop on ‘The Political Economy of Green-Digital Transition’ at LUT University in Lappeenranta, Finland.

In July, the final version of our article on ‘Who bears the burden of a pandemic? COVID-19 and the transfer of risk to digital platform workers’ came out in American Behavioral Scientist.

August is a quieter month (but I greatly enjoyed a session at the Paralympics in Paris!), so I’ll jump to September. Lots of activities: a trip to Cambridge, UK, and a workshop on disinformation at the Minderoo Centre for Technology and Democracy; a workshop on Invisible Labour at Copenhagen Business School in Denmark; and a one-day conference on gender in the platform economy in Paris. Another publication came out: a journal article, in Spanish, on Argentinean platform data workers.

More publications in October: a book chapter, in Portuguese, on ‘Fabricar os dados: o trabalho por trás da Inteligência Artificial’, and a journal article, in French, on the ethics and methodology of using graph visualizations in fieldwork (an older topic to which I’m still attached – and which takes on new importance with today’s fast renewal of research ethics!).

At the end of October, and until mid-November, I travelled to Chile for the seventh conference of the International Network on Digital Labour (INDL-7), which I co-organised. It was an immensely rewarding experience. I took the opportunity to strengthen my linkages and collaborations with colleagues there. It was a very intense, and super-exciting, time: after INDL-7 (28-30 October), I spent a week in Buenos Aires, Argentina, where I co-presented work in progress at the XV Jornadas de Estudios Sociales de la Economía, UNSAM. I then returned to Chile where I gave a keynote at the XI COES International Conference in Viña del Mar, Chile, on 8 November, and another at the ENEFA conference in Valdivia (Chile) on 14 November. I also gave a talk as part of the ChiSocNet series of seminars in Santiago, 11 November.

December was my return to teaching… and planning for the new year! Of note, I was interviewed for a Swiss podcast.

Cambridge

Today, I end my three-and-a-half-month visit to Churchill College, University of Cambridge, where I am a By-Fellow. It has been an amazingly enriching experience and I gratefully acknowledge financial support from the French Embassy in the United Kingdom. Colleges are special places where traditional elitism mixes with more modern tendencies toward openness and diversity. I think the great value of colleges rests in their deeply interdisciplinary culture – way beyond what one may find in university departments and research centres. In my short stay, I have had lots of mind-opening conversations with scholars from all domains (often while enjoying a nice meal together), always with the feeling that people listen and learn from each other, rather than the sense of constant competition that I have often perceived when crossing disciplinary boundaries.


My by-fellowship would not have been possible without the support of Gina Neff and her colleagues at the Minderoo Centre for Technology and Democracy, who hosted me. They are doing extremely valuable work to rethink the social and environmental impact of technologies and to promote innovative and more sustainable ways forward. I was also honoured to collaborate with the team of Cambridge Digital Humanities, especially Anne Alexander, who directs the Learning programme and invited me to give two sessions on social network analysis at the Social Data School last June. Finally, I thank the director and the members of the CRASSH research centre (where Minderoo is based), who kindly welcomed me at their offices and gave me the opportunity to attend some of their recent events.

New year, new job, new life…

Yes, I must admit it: I didn’t keep my new-year-2015 promise of posting more often on my blog… and the annual report I received yesterday from WordPress, showing a couple of peaks of activity and frightening silence the rest of the year, isn’t something I would be proud to share… but I have a justification! Seriously, it’s not just an excuse – it’s that I’ve been busy trying to change my life… and yes, I managed. On Monday 4 January, I’ll start an exciting new position as senior research scientist at the National Centre for Scientific Research (CNRS, or in French, Centre national de la recherche scientifique) in Paris. CNRS can be loosely compared to what is, in other countries, a National Research Council, but there’s more to it than international comparisons might vaguely suggest: this is probably the single most desired job in French academia, with a mission “to contribute to the development of knowledge… in all fields that contribute to the advancement of society”. In plain words, that’s basically pure research with almost no teaching apart from some PhD supervision… a dream that would hardly be possible in the UK, where I was before.

I’ll be at the Lab for Computer Science (LRI, Laboratoire de Recherche en Informatique, UMR8623) on the Saclay campus, and I’ll work with the A&O (Learning and Optimization) research team. The interesting thing is that mine is an interdisciplinary position, designed to facilitate dialogue and collaboration between the social sciences and computer science around big data and their use for the advancement of knowledge, policy, and more generally society. I was specifically selected by the sociology section of CNRS to work in a computer science research centre. There, I am asked to develop my personal, long-term research project on the “sharing economy” of digital platforms and how they create value from the social ties in which economic action is embedded: this will require blending my research on data, social networks and the digital economy with machine learning and optimization approaches (more on this later … yes on this blog! promise!).

What else will I do this year at LRI? I am on the organising committee of the Second European Social Networks Conference, which will take place in Paris next June, I am finishing a book on so-called “pro-anorexia” websites as the conclusion of my past project ANAMIA, and I am on the Editorial Board of Revue Française de Sociologie.

I won’t entirely forget England though… I’ll keep my doctoral students at Greenwich and continue my engagement at UCL’s Institute of Education as external examiner. Come on, you can’t just disappear after six years! Indeed, I’ll always remember those six years as most productive and fulfilling. And however happy I am now to join CNRS, I’ll never forget the expressions of love, sympathy and friendliness I received from colleagues and students when I left Greenwich in December. The cards, the presents, the parties… all beyond any expectations I might have had before! Thank you Greenwich. And well, yes, a big thank you to all those who made it possible – both those in London who gave me a great time far from home for so long, and those in Paris who helped me come back, not without effort, and have welcomed me now.

A great new year is about to start, and I promise I’ll document it more… 😉

#bigdataBL

On Friday last week, the British Sociological Association (BSA) held an event on “The Challenge of Big Data” at the British Library. It was interesting, stimulating and relevant – I was particularly impressed by the involvement of participants and the very intense live-tweeting, never so lively at a BSA event! And people were particularly friendly and talkative both on their keyboards and at the coffee tables… so in honour of all this, I am choosing the hashtag of the day #bigdataBL as title here.

(Visualisation: http://www.digitalcoeliac.com/)

Some highlights:

  • The designation of “big data” is from industry, not (social) science, said a speaker at the very beginning. And it is known to be fuzzy. Yet it becomes a relevant object of scientific inquiry in that it is bound to affect society, democracy, the economy and, well, social science.
  • Big-data practices change people’s perception of data production and use. Ordinary people are now increasingly aware that a growing range of their actions and activities are being digitally recorded and stored. Data are now a recognized social object.
  • Big data needs to be understood in the context of new forms of value production.
  • So, social scientists need to take note (and this was the intended motivation of the whole event). The complication is that Big Data matter for social science in two different ways. First, they are an object of study in themselves: what are their implications for, say, inequalities, democratic participation, the distribution of wealth? Second, they offer new methods to be exploited to gain insight into a wide range of (traditional and new) social phenomena, such as consumer behaviours (think of Tesco supermarket sales data).
  • Put differently, if you want to understand the world as it is now, you need to understand how information is created, used and stored – that’s what the Big Data business is all about, both for social scientists and for industry actors.


What is data?

All the hype today is about Data and Big Data, but this notion may seem a bit elusive. My students sometimes struggle to understand the difference between “data” and “literature”, perhaps because of the unfortunate habit of calling library portals “databases”. Even colleagues are sometimes uncomfortable with the notion of data (whether “big” or “small”) and the breadth it is now taking. So, a definition can be helpful.

Data are pieces of unprocessed information – more precisely raw indicators, or basic markers, from which information is to be extracted. Untreated, they hardly reveal anything; subject to proper analysis, they can disclose the inner workings of some relevant aspects of reality.

The “typical” example of socioeconomic data is the observations/variables matrix, where each row represents an observation – an individual in a population – and each column represents a variable – a particular indicator about that individual, for example age, gender, or geographical location. (In truth, data types are more varied and may also include unstructured text, images, audio and video; but for the sake of simplicity, let’s stick to the Matrix here.)
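To make the matrix idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the three individuals, their values, and the helper name `column` are assumptions, not data from any real survey.

```python
# Hypothetical toy dataset: all names and values are made up for illustration.
# Column headers: one entry per variable.
variables = ["age", "gender", "location"]

# One row per observation (an individual), one value per variable.
observations = [
    [34, "F", "Paris"],
    [51, "M", "London"],
    [27, "F", "Santiago"],
]


def column(matrix, names, name):
    """Return every individual's value for one variable (one column)."""
    index = names.index(name)
    return [row[index] for row in matrix]


n_observations = len(observations)  # number of rows
n_variables = len(variables)        # number of columns

# Untreated, the rows say little; a simple analysis step extracts information.
ages = column(observations, variables, "age")
mean_age = sum(ages) / len(ages)

print(f"{n_observations} observations x {n_variables} variables")
print(f"mean age: {mean_age:.1f}")
```

Reading across a row describes one individual; reading down a column describes how one indicator varies across the population, which is where analysis usually starts.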


Big data: Quantity or quality?

The very designation of “Big” Data suggests that size of datasets is the dividing line, distinguishing them from “Small” Data (the surveys and questionnaires traditionally used in social science and statistics). But is that all – or are there other, and perhaps more profound, differences?

Let’s start from a well-accepted, size-based definition. In its influential 2011 report, McKinsey Global Institute depicts Big Data as:

“datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze”.

Similarly, O’Reilly Media (2012) defines it as:

“data that exceeds the processing capacity of conventional database systems”.

The literature goes on discussing how to quantify this size, typically measured in terms of bytes. McKinsey estimates that:

“big data in many sectors today will range from a few dozen terabytes to multiple petabytes (thousands of terabytes)”

This is not set in stone, though, depending on both technological advances over time and specific industry characteristics.
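For a sense of the orders of magnitude involved, the sketch below converts between the units in the McKinsey quote. It assumes decimal (SI) units, where a petabyte is a thousand terabytes as the quote states; the sample figures of 36 TB and 3 PB are merely hypothetical stand-ins for “a few dozen terabytes” and “multiple petabytes”.

```python
# Assumption: decimal (SI) units. Binary units (TiB, PiB) would use powers of 1024.
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15


def pb_to_tb(petabytes):
    """Convert whole petabytes to terabytes using integer arithmetic."""
    return petabytes * BYTES_PER_PB // BYTES_PER_TB


def tb_to_pb(terabytes):
    """Convert terabytes to (possibly fractional) petabytes."""
    return terabytes * BYTES_PER_TB / BYTES_PER_PB


# Hypothetical stand-ins for the two ends of the quoted range.
low_tb = 36   # "a few dozen terabytes"
high_pb = 3   # "multiple petabytes"

low_bytes = low_tb * BYTES_PER_TB
high_bytes = high_pb * BYTES_PER_PB

print(f"{low_tb} TB = {low_bytes:,} bytes = {tb_to_pb(low_tb)} PB")
print(f"{high_pb} PB = {high_bytes:,} bytes = {pb_to_tb(high_pb)} TB")
```

With these illustrative figures, the upper end of the range is almost two orders of magnitude larger than the lower end, which is one reason a purely size-based threshold is hard to pin down.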


Hallo world – a new blog is now live!

Hallo Data-analyst, Data-user, Data-producer or Data-curious — whatever your role, if you have the slightest interest in data, you’re welcome to this blog!

This is the first post and as is customary, it needs to tell what the whole blog is about. Well, data. Of course! But it aims to do so in an innovative, and hopefully useful, way.
