Data and theory: substitutes or complements? Lessons from the history of economics
Today, my chapter on “Formalization and mathematical modelling” is published in a new series of three reference books on History of Economic Analysis (edited by G. Faccarello and H. Kurz, Edward Elgar). The chapter draws heavily on key ideas I developed as part of my thesis on the origins of mathematical economics. But this was a long time ago and reading it again today, I see it in a different light. I notice in particular that economics developed its distinctive mathematical flavour, which makes it neatly stand out relative to the other social sciences, at times in which social research was data-poor – and it did so not despite data paucity, but precisely because of it. William S. Jevons, a 19th-century forefather of the discipline who was clearly aware of the relevance of maths, wrote in 1871:
“The data are almost wholly deficient for the complete solution of any one problem”
“we have mathematical theory without the data requisite for precise calculation”
What did he mean? He saw formalization as an approach to theory-building which, starting from the definition of theoretical constructs and of the linkages between them, allows logical conclusions to be derived from given premises. It does not even require its objects to be measurable or quantifiable, as long as they can be expressed as variables or functions of variables and can be compared (larger/smaller, more/less, positive/negative, and so on). This idea is what, at the time, solved the paradox of utility, which had previously hindered the understanding of economic motivations. Although we cannot measure utility (let alone compare the utilities of different people), we don’t need to do so to explain economic behaviour: we just need to know whether something brings more utility than something else, and we’ll be able to tell which choice an economic actor is most likely to make.
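To make the ordinal argument concrete, here is a minimal Python sketch. The goods, the utility numbers and the function names are all invented for illustration and come from neither Jevons nor the chapter. It only shows that the predicted choice depends on the ranking of options, not on the numbers attached to them.

```python
import math

# Hypothetical utility levels (an assumption for illustration):
# only their order is meant to carry any meaning.
utility = {"tea": 1.0, "coffee": 2.5, "water": 0.3}

def best_choice(options, u):
    """Return the option ranked highest by the utility function u."""
    return max(options, key=u)

def original(option):
    """Utility on the original, arbitrary scale."""
    return utility[option]

def rescaled(option):
    """Same ranking on a different scale: a strictly increasing transformation."""
    return math.exp(3 * utility[option]) + 100

options = list(utility)
choice_original = best_choice(options, original)
choice_rescaled = best_choice(options, rescaled)

# Both scales predict the same choice, because the ranking is unchanged.
print(choice_original, choice_rescaled)
```

Any other strictly increasing rescaling would give the same answer, which is why comparing "more" and "less" is enough and measuring is not required.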
This approach made possible the development of formalization in economics for a century and a half, and formalization is more than just quantification. It was not about attempting to measure social reality, but about understanding its inner workings. It was about formulating hypotheses and ideas and building theories, prior to any confrontation with data. Lack of data was not a problem as long as theory provided support.
This is just the reverse of the “end of theory” approach that some thought was coming with the recent advent of big data. The research world of today is not data-poor; it is flooded with data. And the data deluge, some said, would be enough to understand the world and make theory obsolete, in the social sciences just as in biology or physics. That’s just another way of saying what Jevons suggested long ago: theory and data are substitutes.
The data revolution is still in progress, so it’s difficult to say where it is heading. But it’s already clear that the end of theory is not the only potential outcome. Some have started seeing data and theory as complements, not substitutes. In science (and not only social science), the most significant progress has been made when theory has been used to make sense of data. The hype today is about empirical validation and verification, testing (formal) theory against (quantitative) data according to some version of the scientific method. Economics has re-centred itself around the experimental approach. Causality is at the heart of econometrics. And causality matters more and more even in machine learning, where prediction is no longer seen as sufficient.
This is different from old-style formalization: it is another way to enrich our understanding of the social and economic world through mathematical tools. But this appears to be the way forward. Jevons, it seems, was not right after all.