Speaker: Göran Kauermann
Big Data has certainly been one of the buzzwords of the last five years. With the digital revolution, nearly everything can now be measured, recorded or logged. The volume of data runs to many petabytes and grows every year at a rate that would have seemed unbelievable a few years ago. But the sheer flood of data is of little use if no information is drawn from it. This step is the real challenge for the coming decades. In the gold-rush days of Big Data, some early voices even claimed that Big Data spells the end of theory, since with enough data every question can be answered. It is becoming more and more apparent, however, that the step from Big Data to Big Information is full of obstacles and traps.
The talk presents some examples where Big Data by itself leads to little or even invalid information unless the data are supplemented by other informative data and/or accompanied by theoretical approaches and data-analytic models. In general, the question posed to Big Data needs to be properly defined. And while Big Data may be big, they are not necessarily good. The balance between the quality and the quantity of data matters, yet practitioners are often blindly impressed by sheer quantity without even questioning the quality of the data. We demonstrate by statistical means why quality is often more important than quantity. The analysis of Big Data is challenging and at the same time offers tremendous opportunities. Novel statistical and computational tools are required, but classical statistical approaches also gain new relevance.
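The point about quality versus quantity can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the analysis presented in the talk: the population, the response probabilities (0.6 and 0.4), the sample sizes and the function name `simulate` are all hypothetical choices made for illustration. It compares a very large self-selected sample with a much smaller simple random sample when both are used to estimate the same population share.

```python
import random


def simulate(seed=0, population_size=200_000, true_share=0.5, srs_size=2_000):
    """Compare a large self-selected sample with a small random sample.

    All numbers here are illustrative assumptions, not figures from the talk.
    """
    rng = random.Random(seed)

    # Hypothetical population: each unit holds a binary attribute (1 or 0).
    population = [1 if rng.random() < true_share else 0
                  for _ in range(population_size)]
    true_mean = sum(population) / population_size

    # "Big" data: units select themselves into the sample, and units with
    # attribute 1 are assumed to respond more often (0.6) than the rest (0.4).
    big = [y for y in population
           if rng.random() < (0.6 if y == 1 else 0.4)]
    big_mean = sum(big) / len(big)

    # "Good" data: a simple random sample, far smaller but free of selection.
    srs = rng.sample(population, srs_size)
    srs_mean = sum(srs) / srs_size

    return {
        "true_mean": true_mean,
        "big_n": len(big),
        "big_mean": big_mean,
        "srs_n": srs_size,
        "srs_mean": srs_mean,
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key}: {value}")
```

Under these assumptions the self-selected sample is many times larger than the random sample, yet its estimate stays off target no matter how many units it contains, because the selection bias does not shrink with sample size. The random sample carries only sampling noise, which does shrink as the sample grows.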
The combination of the two disciplines, statistics and computer science, leads to the new field of data science. And while Big Data continue to grow, the real gold mine lies in data science. Hence, theory and science are indispensable for getting from Big Data to Big Information.