Big data: Quantity or quality?

The very designation “Big” Data suggests that the sheer size of datasets is the dividing line, distinguishing them from “Small” Data (the surveys and questionnaires traditionally used in social science and statistics). But is that all – or are there other, perhaps more profound, differences?

Let’s start with a well-accepted, size-based definition. In its influential 2011 report, the McKinsey Global Institute describes Big Data as:

“datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze”.

Similarly, O’Reilly Media (2012) defines it as:

“data that exceeds the processing capacity of conventional database systems”.

The literature goes on to discuss how to quantify this size, typically measured in bytes. McKinsey estimates that:

“big data in many sectors today will range from a few dozen terabytes to multiple petabytes (thousands of terabytes)”.

This threshold is not set in stone, though: it depends both on technological advances over time and on the characteristics of specific industries.
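To get a concrete feel for these orders of magnitude, here is a minimal Python sketch using decimal (SI) units; the example figures of 30 TB and 5 PB are illustrative assumptions standing in for “a few dozen terabytes” and “multiple petabytes”, not numbers from the report itself:

```python
# Decimal (SI) units: 1 TB = 10**12 bytes, 1 PB = 10**15 bytes.
TB = 10**12  # bytes in a terabyte
PB = 10**15  # bytes in a petabyte

low_end = 30 * TB   # assumed stand-in for "a few dozen terabytes"
high_end = 5 * PB   # assumed stand-in for "multiple petabytes"

print(f"1 PB = {PB // TB} TB")          # a petabyte is a thousand terabytes
print(f"low end:  {low_end:,} bytes")
print(f"high end: {high_end:,} bytes")
```

In other words, the upper end of McKinsey’s range is over a hundred times larger than the lower end – the definition spans more than two orders of magnitude.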


Hello world – a new blog is now live!

Hello Data-analyst, Data-user, Data-producer or Data-curious — whatever your role, if you have even the slightest interest in data, welcome to this blog!

This is the first post and, as is customary, it should say what the whole blog is about. Well, data, of course! But it aims to do so in an innovative, and hopefully useful, way.

