Tuesday, 9 September 2014

Big Data is a window on the real world through digitization



From computerization to digitization
Data started with computerization in the 1950s and has grown tremendously since 2006, when smartphones started to reach users' hands


Computerization started in the 1950s with process automation and number crunching. With these techniques, we sent Apollo to the Moon. Data is input/output



Networking started in the 1970s with ARPANET, the Internet and the World Wide Web. It delivers communication and ubiquity to everybody. Data is in-flow/out-flow


The virtual world started around the 1990s with computer simulation capabilities: building virtual products and virtual factories, prototyping and testing before producing any physical result. Data is real/virtual


Digitization started with smartphones around 2006. Every person in the world can now hold a small, powerful, communicating and mobile processor for everyday life: a source of data from nearly 7 billion users around the world

By now, digitization is expanding to households, medicine, traffic control, design, the Internet of Things…

From data to Big Data 
With digitization, data becomes Big Data

Is there a difference between data and Big Data, besides volume, format or type?

                                 Yes, there is a major difference

Data is input/output, in-flow/out-flow or real/virtual, and data has to fit nicely for computerization to avoid garbage-in/garbage-out. Data cannot be garbage-in: it has to be structured in a way that the computer program can process. Data and computer models are linked. BI and OLAP are examples of this.
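The classical rule can be illustrated with a minimal sketch: a program only aggregates records shaped the way its model expects, and rejects anything else at the door. The schema and records below are purely hypothetical.

```python
# Hypothetical OLAP-style schema: the structure the program expects.
EXPECTED_FIELDS = {"product", "region", "units"}

def accept(record):
    """Reject garbage-in: keep only records matching the expected structure."""
    return (set(record) == EXPECTED_FIELDS
            and isinstance(record["units"], int)
            and record["units"] >= 0)

records = [
    {"product": "yogurt", "region": "SE-Europe", "units": 120},
    {"product": "yogurt", "units": "lots"},   # garbage-in: wrong shape, rejected
    {"product": "yogurt", "region": "SE-Europe", "units": 80},
]

clean = [r for r in records if accept(r)]
total_units = sum(r["units"] for r in clean)
print(total_units)  # → 200: only the well-structured records are aggregated
```

This is the classical world: the data is bent to fit the model, and whatever does not fit is discarded.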

Big Data mirrors reality and everyday life, so you cannot have garbage-in (unless you think reality is garbage, but that is another subject). Big Data reflects the world we live in, so if we change anything about the Big Data input in order to get smooth processing, then we change the reality; we are lying about truth and events (I know, it is tempting to change the world to drive nice results, but we should follow Oscar Wilde's recommendation…)

Processing Big Data with Turing machines: the difficult task
So if you cannot change Big Data, it is up to the model to fit the data "as it is" for processing purposes. This is one of the challenges of Big Data: a methodology for building processing capabilities on very large quantities of data

But today, processing relies on Turing machines, and these machines demand logic. The only way to manage that logic today is to build models, but a model only reflects part of reality; a model is a kind of abstraction of reality, a model without a single contradiction…

So the difficult task is: how do we compute Big Data with logic and models, knowing that these will never cover the total scope of Big Data?

No answer today, at least on this blog, but I am sure that with your comments we will get some answers.

Big Data processing for solving real-life problems
So before the computer builders deliver us some nice synaptic, quantum or neural machines, let us start with some practical approaches on our old, ugly-looking Turing machine (the tower I have under my desk). Big Data brings the raw material to answer questions, to solve problems and, if possible, to act.

For example:

  • How many yogurts will I sell next month in south-east Europe?
  • What will the $/£ exchange rate be next quarter?
  • What will the budget of the next Olympic Games be?
  • When will we find a vaccine against malaria?
  • How many years before the Big Crunch?
  • Is there any risk that our planet ends up like Mars (no water, no life)?
  • Is there life somewhere outside the solar system?
  • When will the deficits of some European countries decrease?
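For the yogurt question, here is a minimal sketch of what a practical approach on a plain Turing machine looks like: fit a least-squares trend line to the sales history "as it is" and extrapolate one month ahead. The monthly figures are entirely made up for illustration.

```python
# Hypothetical units sold over the last 6 months (made-up figures).
monthly_sales = [100, 104, 109, 115, 118, 124]
n = len(monthly_sales)
xs = range(n)

# Ordinary least squares for y = a*x + b, computed by hand (no libraries).
mean_x = sum(xs) / n
mean_y = sum(monthly_sales) / n
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_sales))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - a * mean_x

forecast = a * n + b  # estimate for next month
print(round(forecast, 1))  # → 128.5
```

Of course, a straight line is exactly the kind of model discussed above: an abstraction that covers only a sliver of reality. It answers the question only as far as the trend it captures holds.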


I am sure you are eager to have some answers, and Big Data will help, but not yet. It will only be possible with your comments. We just have to believe it is possible. I do believe it.

Cheers for comments
Robert
