ML models are often referred to as "black boxes", especially when they are derived from neural-network-type approaches. Yet many of these approaches are powerful and do indeed outperform "classical" statistical inference. Calls for transparency and comprehensibility are therefore justified.
Not least because of the great potential of new ML techniques, there is a tendency to "trust the algorithm" and, consequently, to neglect the vital importance of curated input data for getting the expected results. Yet understanding and shaping the input features is the most important part of modeling, because that is where one can influence performance the most. In this web session, we will take a closer look at semantically formatting and generating the input space of machine learning tools.
As most actuaries are already well-versed in the treatment of highly structured data sets, this web session will instead look at unstructured data such as time series (e.g. from telematics) and fragmented contextual data (such as social networks).