Archive for the ‘Data & methods’ Category

Science XXL: digital data and social science

Last week I attended (unfortunately only part of) an interesting workshop on how today’s abundance and diversity of digital data affect social science practices, aptly called “Science XXL”. A variety of topics were discussed and different research experiences were shared, but I’ll just summarize here a few lessons learned that I find interesting.

  • Digital data are archive data. Data retrieved automatically from the digital traces of individual actions, for example those mined from the APIs of platforms such as Twitter, are unlike survey data in that they were not originally recorded for research purposes. The researcher must select relevant records on the basis of some understanding of the conditions under which these data were produced. Perhaps ironically, digital data share this characteristic with data from historical or literary archives.
  • Digital data are not necessarily “big”, in the sense that their volume is often small (at least in social science research so far!), even though they may share other characteristics of big data such as velocity (being generated on the fly as people use digital platforms) or variety (being loosely structured or unstructured).
  • Digital data can help fill gaps in survey data, for example when survey sampling is not statistically representative: detail and volume can provide extra information that supports general conclusions.
  • Non-clean data, outliers and aberrant observations may be very informative, revealing details that would escape attention if researchers focused only on the average or center of the distribution (the normal law cherished in classical statistical approaches). Special cases are no longer a prerogative of qualitative research.
  • Data analysis is a key ingredient of “computational social science”, a field that is growing in importance after an initial phase in which it was largely confined to agent-based simulation and complexity theory.
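
The point about outliers can be illustrated with a minimal sketch. The numbers below are invented for illustration (they are not workshop data), and the cutoff of 5 median absolute deviations is an arbitrary choice, not a recommended standard. The sketch only shows how a summary based on the average hides a single aberrant observation that a robust check brings to the surface.

```python
import statistics

# Hypothetical data: daily message counts for ten made-up accounts.
counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 240]

mean = statistics.mean(counts)
median = statistics.median(counts)

# Median absolute deviation (MAD): a measure of spread that is
# itself barely affected by extreme values.
mad = statistics.median([abs(x - median) for x in counts])

# Flag observations far from the median. The cutoff is an
# illustrative assumption, not a rule.
CUTOFF = 5
outliers = [x for x in counts if mad > 0 and abs(x - median) / mad > CUTOFF]

print(f"mean={mean}, median={median}, outliers={outliers}")
```

Here the mean is pulled well above what any typical account does, while the median stays close to the bulk of the observations. The flagged case is exactly the kind of special observation that may deserve a closer, possibly qualitative, look.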

New: Paris Seminar on the Analysis of Social Processes and Structures (SPS)

Together with colleagues Gianluca Manzo, Etienne Ollion, Ivan Ermakoff, and Ivaylo Petev, I organize a new inter-institutional seminar series in sociology.

This new Social Processes and Structures (SPS) Seminar aims to take stock of the debates within the international scientific community that have repercussions for the practice of contemporary sociology, and that renew the ways in which we construct research designs, i.e., the ways in which we connect theoretical claims, data collection and methods to assess the link between data and theory. Several observations motivate this endeavor. Increasing interactions between social sciences and disciplines such as computer science, physics and biology outline new conceptual and methodological perspectives on social realities. The availability of massive data sets raises the question of the tools required to describe, visualize and model these data sets. Simulation techniques, experimental methods and counterfactual analyses modify our conceptions of causality. Crossing sociology’s disciplinary frontiers, network analysis expands its range of scales. In addition, the development of mixed methods redraws the distinction between qualitative and quantitative approaches. In light of these challenges, the SPS seminar discusses studies that, no matter their subject and disciplinary background, provide the opportunity to deepen our understanding of the relations between theory, data and methods in social sciences.

The inaugural session took place on 20 November 2016; the “regular” series starts this Friday, 27 January, and will continue until June, with one meeting per month.

All sessions take place at Maison de la Recherche, 28 rue Serpente, 75006 Paris, room D040, 5pm-7pm. All interested students and scholars are welcome, and there is no need to register in advance.


Special RFS issue on Big Data

Revue Française de Sociologie invites article proposals for a special issue on “Big Data, Societies and Social Sciences”, edited by Gilles Bastin (PACTE, Sciences Po Grenoble) and myself.

The focus is on two inextricably interwoven questions: how do big data transform society? How do big data affect social science practices?

Substantive as well as epistemological / methodological contributions are welcome. We are particularly interested in proposals that examine the social effects and/or the scientific implications of big data based on first-hand experience in the field.

The deadline for submission of extended abstracts is 28 February 2017; for full contributions, it is 15 September 2017. Revue Française de Sociologie accepts articles in French or English.

Further details and guidelines for submission are in the call for papers.

Data and theory: substitutes or complements? Lessons from history of economics

Today, my chapter on “Formalization and mathematical modelling” is published in a new series of three reference books on History of Economic Analysis (edited by G. Faccarello and H. Kurz, Edward Elgar). The chapter draws heavily on key ideas I developed as part of my thesis on the origins of mathematical economics. But this was a long time ago and, reading it again today, I see it in a different light. I notice in particular that economics developed its distinctive mathematical flavour, which makes it neatly stand out relative to the other social sciences, at times in which social research was data-poor – and it did so not despite data paucity, but precisely because of it. William S. Jevons, a 19th-century forefather of the discipline who was clearly aware of the relevance of maths, wrote in 1871:

“The data are almost wholly deficient for the complete solution of any one problem”

yet:

“we have mathematical theory without the data requisite for precise calculation”


Visualization in mixed-methods research on social networks

The journal Sociological Research Online has just published (31 May 2016) a special section on “Visualization in Mixed-Methods Research on Social Networks”, guest edited by Alessio D’Angelo, Louise Ryan and myself.

Figure 1 – Tubaro, Ryan & D’Angelo

The five papers in this peer-reviewed special issue explore the potential of visual tools to accompany qualitative and mixed-methods research. Visualization can support data collection, analysis and presentation of results; it can be used for personal or complete networks; it can be paper-and-pencil or computer-based. Overall, visualization helps to jointly understand network contents and network structures.

The special issue is freely accessible from all commercial (non-academic) internet providers.


New publications on big data and official statistics

National Statistical Institutes (NSIs) have long been the recognised repositories of all socio-economic information, mandated by governments to collect and analyse data on their behalf. The development of big data is shaking this world. New actors are coming in and commercially-oriented, privately-produced information challenges the monopoly of NSIs. At the same time, NSIs themselves can tap into digital technologies and produce “big” data. More generally, these new sources offer a range of opportunities, challenges and risks to the work of NSIs.

The Statistical Journal of the IAOS, the flagship journal of the International Association for Official Statistics, has published a special section on big data – of particular interest to the extent that it is free of charge!

Fride Eeg-Henriksen and Peter Hackl introduce this special section by defining big data and emphasising its interest for official statistics. But it is crucial, albeit admittedly not easy, to separate the hype around big data from its actual importance.

The other papers are concrete examples of how big data may be integrated into official statistics:


The power of survey data: Eurostat Users’ Conference

In the age of big data, social surveys haven’t lost their appeal and interest. Surveys are the instrument through which governments have long gathered information on their population and economy to inform their choices. Interestingly, surveys conducted by, or for, governments are the best in terms of quality and coverage: because significant resources are invested in their design and realization, and especially because participation can be made compulsory by law (they are “official”), their sampling strategies are excellent and their response rates are extremely high. (Indeed, official government surveys are practically the only case in which the “random sampling” principles taught in theoretical statistics courses are actually applied.) In short, these are the best “small data” available, and their qualities make them superior to many a (usually messy) big data collection. It is for this reason that surveys from official statistics have always been in high demand by social researchers.
