Big Data

Book Review: Big Data
Review by Madhuri Lele

BIG DATA: A Revolution That Will Transform How We Live, Work and Think
by Viktor Mayer-Schönberger and Kenneth Cukier

If the title and tagline are not captivating enough to make a reader pick up ‘BIG DATA’, the Wall Street Journal’s summing-up surely will be: ‘No other book offers such an accessible and balanced tour of the many benefits and downsides of our continuing infatuation with data’.
Brief verdicts from the Observer (‘Fascinating’) and the Financial Times (‘An excellent primer’) make the decision to pick it up easier still!

This is a groundbreaking and fascinating book by two of the world’s most respected big data experts, who objectively set out both the benefits and the darker side of big data. Right at the beginning the authors explicitly affirm, “In this book we are not so much big data’s evangelists, but merely messengers”. The book reveals the reality of a new world and offers actionable steps that equip the reader with the tools needed to harness big data rather than be crushed by it.

Early in the book the authors explain how the danger to individuals shifts from privacy to probability: algorithms will predict the likelihood that a person will have a heart attack (and pay more for insurance), default on a mortgage (and be denied a loan), or commit a crime (and perhaps get arrested in advance). Such predictions in effect punish people for their propensities, not their actions. This raises an ethical question about the role of free will versus the dictatorship of data: what are the implications for human freedom and dignity? Most strikingly, the book brings out how society will need to shed some of its obsession with causality in exchange for simple correlation: knowing not why but only what.

The book takes the reader through details such as the origin of the word “data”, which means “given” in Latin, in the sense of a “fact”. It became the title of a classic work by Euclid, in which he explains geometry from what is known or can be shown to be known. Today the word refers to a description of something that allows it to be recorded, analyzed, and reorganized.
Further on, the book explains the difference between datafication and digitization and traces how each evolved.
Commander Maury, the “Pathfinder of the Seas”, was among the first to grasp a core tenet of big data: information buried in existing records, in his case old ships’ logbooks, could be extracted and tabulated, turning it into eminently useful data.

Likewise, Professor Koshimizu took ‘the way a person sits’, something that had never been treated as data or even imagined to have an informational quality, and transformed it into a numerically quantified format. Few would think that the way a person sits constitutes information, but it can. When a person is seated, the contours of the body, the posture and the distribution of weight can all be quantified and tabulated. Some additional measurement and indexing can yield a digital code that is unique to each individual, which could be used to deter car theft, help reconstruct accidents, and so on.

In the absence of a good term for the kind of transformation produced by Commander Maury and Professor Koshimizu, the authors call it datafication, and clarify that to datafy a phenomenon is to put it in a quantified format so that it can be tabulated and analyzed.

Digitization is very different from datafication. Digitization is the process of converting analog information into the zeros and ones of binary code so that computers can handle it. Datafication had been practiced for a long time; taking analog content and digitizing it came much later. In his landmark 1995 book Being Digital, Nicholas Negroponte of the MIT Media Lab made the shift from atoms to bits one of his big themes. In short, digitization turbocharges datafication, but it is not a substitute. The act of digitization, turning analog information into a computer-readable format, does not by itself datafy.

The classic example is Google’s bold plan, announced in 2004, to digitize books. First it digitized the text: each page was scanned into a high-resolution digital image file stored on Google’s servers. At that point the text had not been datafied; it existed only as images that humans could turn into useful information by reading them. Many companies are now doing this, and one of them, of course, is Amazon.
So the book captures the journey in which words, location and interactions become data, eventually leading to the datafication of everything.

The examples and case studies show how big data can affect everyday life for the better. The book covers many; to showcase a few: UPS uses “geo-loco” data in multiple ways, and has shaved 30 million miles off its drivers’ routes, saving three million gallons of fuel and 30,000 metric tons of carbon-dioxide emissions. Fitbit, Jawbone, Basis and Apple enable people to learn how their bodies work by datafying them. iTrem uses the phone’s built-in accelerometer to monitor Parkinson’s and other neurological disorders.

The book also details the big-data value chain, which comprises data holders (VISA and MasterCard), data specialists (Accenture, Microsoft Research) and companies with a big-data mindset (FlightCaster, Amazon, Google).

There is also an interesting example of how companies are using data more strategically: Rolls-Royce sells engines and also offers to monitor them, and services account for 70 percent of the relevant division’s annual revenue!
Moving on to the risks, we get to know the darker side. The book brings home the reality that we are entering a world of constant data-driven predictions, in which we may not be able to explain the reasons behind our decisions, and that this will affect our lives tremendously. The argument builds to its most significant warning: “In an era of big-data there will be a special need to carve out a place for the human: to reserve space for intuition, common sense and serendipity to ensure that they are not crowded out by data and machine-made answers.” The premise is substantiated: “What is greatest about human beings is precisely what the algorithms and silicon chips don’t reveal, what they can’t reveal because it can’t be captured in data. It is not the ‘what is’, but the ‘what is not’: the empty space, the cracks in the sidewalk, the unspoken and the not-yet-thought.”

Much will depend upon the “algorithmists” and their vow of impartiality and confidentiality. Their oversight could help avoid cases such as the flawed ‘no-fly’ list that included Senator Kennedy, and Google’s ‘autocomplete’ feature, which defamed individuals in Japan, France, Germany and Italy.

As the book draws to a close it strikes a serious note: big data is a resource and a tool. It is meant to inform rather than explain, and we must never let its seductive glimmer blind us to its inherent imperfections.

