Speaker(s): David Hand (Imperial College London)
While the theory of actuarial and statistical methods has developed gradually since the mid-seventeenth century, it is only relatively recently that the computer has had its impact on the practical application of these ideas. Increasingly, decisions affecting our lives are based on formal decision-making processes coded within computers. Often these processes are very elaborate, using data no single human could hope to understand, or even manually examine. Sometimes the processes are adaptive, changing according to obscure internal mechanisms as new data become available. While such developments clearly hold huge promise for improving the human condition, they do not come without risks. In particular, the data may be of uncertain quality.

The different dimensions of data quality are examined, categorised as:

- bad data (not the data you want, but a distorted version);
- invisible data (not just the data you have, but also the data you’d like);
- changing data (not the data you’ve got, but the data you’ll have);
- alternative data (not the data you’ve got, but the data you would have had);
- misleading data (not the data you’ve got, but the data you think you’ve got).

Examples are given showing the potential adverse impact on our decisions, and on our lives, and strategies for tackling the problems of poor data quality via detection, prevention, and correction are briefly explored.