Posts Tagged ‘ Big data ’

New ANR Project HUSH: Human supply chain behind smart technologies

Together with sociologist Antonio A. Casilli and economist Ulrich Laitenberger, I have recently received ANR (French National Research Agency) funding for a new study of human inputs – mostly platform-mediated work – in the production of artificial intelligence solutions. Our project, called HUSH (Human supply chain behind smart technologies), aims to shed light on the whole ecosystem linking platforms, workers, and the clients who demand data-related and algorithmic services.

For this project, we are now looking for a

PhD researcher in digital economics

The position provides the opportunity to focus strongly on research, in a very active environment. The team has collaborations with different online platforms and has collected data sets from the web, which can be used by the applicant for their thesis. The focus of the current position is to work on the economic aspects of platform-mediated work, using quantitative analyses. Two other PhD students (in sociology) have already been recruited for this project and work on related topics.

The starting date is January 2020 (a later starting date is also possible). As per national regulations, the stipend will be about 1,600 euros per month, with the possibility of a supplement for extra activities such as teaching. Social security and professional training are provided. Additional funding is available to present your research at international conferences and workshops. The position will be based at the new campus of Telecom Paris in Palaiseau, in the immediate vicinity of École Polytechnique and ENSAE.

Your profile

Applicants should have successfully completed a Master’s degree in economics, socio/economic data science or related disciplines, or expect completion at the beginning of the year 2020. They should have a strong interest in digital platforms, from the perspective of industrial organization or labor economics, and have an empirical focus (econometrics, data science). They should aim at developing programming skills and have an interest in the evaluation of internet data. Fluency in English is required; knowledge of French is advantageous, but not essential.

Telecom Paris and IP Paris

Telecom Paris is part of the newly founded Institut Polytechnique de Paris (IP Paris), together with École Polytechnique, ENSTA, ENSAE and Telecom SudParis. The department of social sciences and economics (SES) at Telecom Paris studies the impact of digitization on economic activity and society. For more information, please see https://www.telecom-paris.fr/fr/lecole/departements-enseignement-recherche/sciences-economiques-sociales/structure/economie-gestion

How to apply

Please submit a cover letter, a curriculum vitae, a transcript of records (listing all subjects taken and their grades), and contact details of one to two referees by November 15, 2019 to Ulrich Laitenberger ( laitenberger@enst.fr ).

The big data moment of the social sciences: what access to web and social media data?

Round table, Sciences Po Paris, 6 December 2018, 6:00 pm


For social science research to take full advantage of large digital databases, one obstacle remains to be removed: access to these data is limited, unequally distributed, and surrounded by legal and ethical uncertainty. We propose to discuss this on the occasion of the publication of the special issue of the Revue Française de Sociologie on “Big data, societies and social sciences” (no. 59/3). This round table brings together researchers and other public and private stakeholders.

With:

  • Garance Lefèvre, Senior Policy Associate, Uber
  • Roxane Silberman, Scientific Advisor, Centre d’Accès Sécurisé aux Données (CASD)
  • Sophie Vulliet-Tavernier, Director of Relations with the Public and Research, Commission Nationale de l’Informatique et des Libertés (CNIL)
  • The authors of the special issue.

Moderators: Gilles Bastin (Univ. Grenoble Alpes) and Paola Tubaro (CNRS), coordinators of the special issue.

Admission is free, subject to available seating: to register, click here.

Access: Sciences Po, Goguel room. Enter at 27 rue Saint-Guillaume, 75007 Paris (cross the garden and take the elevator to the top floor). The round table is organized by the Revue Française de Sociologie in collaboration with the Presses de Sciences Po. It will be followed by a reception.

 

Big data, societies and social sciences

Just published: Big data, societies and social sciences, a special issue of Revue Française de Sociologie, guest-edited by Gilles Bastin and myself.

Read a pre-print of our Introduction here.

English versions will be available soon.

More than complex: large and rich network structures

I am co-organizing this Satellite of the NETSCI2018 Conference, to be held in Paris on 12 June 2018. We are now accepting submissions of proposals for presentations.

Information on the Satellite

In traditional research paradigms, sociology handles small but rich networks, where the richness of network attributes derives from the specific design of the data collection process. In the sociological approach, differences among nodes and edges are key to describing network properties and the ensuing dynamical social processes. The complex systems tradition, by contrast, deals with large but poor networks. Assuming statistical equivalence of graph entities, a mean field treatment serves to describe the aggregate properties of the network. Today’s network datasets contain an unprecedented quantity of relational information at, and between, all possible levels: individuals, social groups, political structures, economic actors, etc. We are finally dealing with large and rich network structures that expose the implicit limitations of the two above-mentioned approaches: the traditional methods from social science cannot be upscaled because of their algorithmic complexity, and those from complex systems lose track of the complex nature of the actors, their relationships and their processes. This workshop aims to develop an interdisciplinary reflection on how methods from social science could be upscaled to large network structures and on how methods from complex systems could be downscaled to deal with small heterogeneous structures.

We are proud that five prominent international scholars are our invited speakers: Camille Roth, Sciences Po Paris; Matthieu Latapy, LIP6, UPMC Paris; Alessandro Lomi, ETH Zurich; Fariba Karimi, GESIS Cologne; Noshir Contractor, Northwestern University.

Contributions

We invite abstracts of published or unpublished work for contributed talks to take place at the satellite symposium. We expect a broad range of topics to be covered, across theory, methodology, and application to empirical data, relating to an interdisciplinary reflection on how methods from social science could be upscaled to large network structures and on how methods from complex systems could be downscaled to deal with small heterogeneous structures.

Submissions can be made through our website.

Submissions are required to be at most 650 words long and should include the following information: title of the talk, author(s), affiliation(s), email address(es), name of the presenter, abstract. Papers or submissions longer than 1 page will not be accepted.

Important dates

Abstract submission deadline is March 25, 2018. Notification of acceptance will be no later than April 23, 2018.

All participants and accepted speakers will have to register through the NETSCI2018 website.

Open Data: What’s new in 2017?

I am now in Montréal, where last Friday I took part in a panel on Open Data at the “Science & You” international conference. It was interesting for me to reflect on how the picture has changed since my previous panel on the same topic, in Kiev in 2012. Back then, we were busy trying to convince public administrations that opening data was good for transparency and could help improve services to communities. Since then, many attempts have been made in numerous countries, with local authorities often pioneering the process, followed only later by central governments (one example cited in my panel was Québec City). What is made open is typically information from public registers (first names of newborns, records of road accidents) and, increasingly, from technological devices and sensors (bus traffic information).

Some conditions must be met for a dataset to be called “open”:

  • Technically, it needs to be “raw”, detailed, digital and reusable. The French Interior Ministry released results of the first round of the recent presidential elections within a few days, at polling station level. This is sufficiently detailed (with over 69,000 polling stations throughout the country), raw (allowing aggregations, comparisons etc.), and digital/reusable (so much so that the newspaper Le Monde could develop a user-friendly application to let readers easily check results in their neighborhoods). Some would also insist that “open” data should be released in non-proprietary formats (better .csv than .xls, for example).
  • Legally, the data must come with a license that allows re-use by third parties (typically within the Creative Commons family). Ideally, no type of reuse should be ruled out (including, somewhat controversially, commercial or for-profit reuse).
  • Economically, the data should be available to all for free (or at least with minimal charges if data preparation requires extra work or expenses).
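
To make the “raw and reusable” condition more concrete, here is a minimal sketch of the kind of aggregation that polling-station-level data in a plain .csv file allow. The column names (commune, station, candidate, votes) and all figures are invented for illustration; they do not reflect the actual files released by the Interior Ministry.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in CSV form; columns and numbers are made up.
SAMPLE_CSV = """commune,station,candidate,votes
Alpha,1,A,120
Alpha,1,B,80
Alpha,2,A,60
Alpha,2,B,140
Beta,1,A,200
Beta,1,B,100
"""

def votes_by_commune(csv_text):
    """Sum station-level votes for each (commune, candidate) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["commune"], row["candidate"])] += int(row["votes"])
    return dict(totals)

def shares_by_commune(csv_text):
    """Convert commune-level totals into vote shares per candidate."""
    totals = votes_by_commune(csv_text)
    commune_sums = defaultdict(int)
    for (commune, _candidate), votes in totals.items():
        commune_sums[commune] += votes
    return {key: votes / commune_sums[key[0]] for key, votes in totals.items()}

if __name__ == "__main__":
    for key, share in sorted(shares_by_commune(SAMPLE_CSV).items()):
        print(key, round(share, 3))
```

Because the data are released at the finest level, anyone can rebuild higher-level totals (or compare neighborhoods) without depending on the publisher's own aggregates, and a non-proprietary format means this needs nothing beyond a standard library.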

While a lot of thought has been devoted in the past few years to the “ideal” conditions for opening data and to how this would positively affect public service, the data landscape has now changed significantly.


Science XXL: digital data and social science

Last week I attended (unfortunately only in part) an interesting workshop on the effects of today’s abundance and diversity of digital data on social science practices, aptly called “Science XXL”. A variety of topics were discussed and different research experiences were shared, but I’ll just summarize here a few lessons learned that I find interesting.

  • Digital data are archive data. Data retrieved automatically from the digital traces of individual actions, for example those mined from the APIs of platforms such as Twitter, are unlike survey data in that they were not originally recorded for research purposes. The researcher must select relevant records on the basis of some understanding of the conditions under which these data were produced. Perhaps ironically, digital data share this characteristic with data from historical or literary archives.
  • Digital data are not necessarily “big”, in the sense that their volume is often small (at least in social science research so far!), even though they may share other characteristics of big data such as velocity (being generated on the fly as people use digital platforms) or variety (being loosely structured or unstructured).
  • Digital data can help fill gaps in survey data, for example when survey sampling is not statistically representative: detail and volume can provide extra information that supports general conclusions.
  • Non-clean data, outliers and aberrant observations may be very informative, revealing details that would escape attention if researchers focused only on the average or center of the distribution (the normal distribution cherished in classical statistical approaches). Special cases are no longer a prerogative of qualitative research.
  • Data analysis is a key ingredient of “computational social science”, a field that is growing in importance after an initial phase in which it was largely confined to agent-based simulation and complexity theory.
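
The point about outliers can be illustrated with a small sketch. The activity counts below are invented, and the rule used to flag unusual cases (a modified z-score based on the median absolute deviation, with the conventional 3.5 cut-off) is simply one common choice picked for illustration, not a method presented at the workshop.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so extreme values do not distort the yardstick.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median: nothing can be flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical number of posts per user on some platform.
posts_per_user = [2, 3, 3, 4, 2, 5, 3, 4, 250]

if __name__ == "__main__":
    print("mean:", statistics.mean(posts_per_user))
    print("median:", statistics.median(posts_per_user))
    print("flagged:", flag_outliers(posts_per_user))
```

In this toy sample, a summary built only on the center of the distribution would either be pulled upward by the single very active user (the mean) or ignore that user altogether (the median); flagging the case explicitly is what makes it available for closer, possibly qualitative, inspection.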