As an econometrician, I cannot believe how many times he said 'magic'. There is something very wrong with putting things in your model 'because, who knows, it might be helpful' (as he did with host names). Variable selection is a very hard problem, and relying on 'magic' is asking for trouble. It is so disappointing to see machine learning, statistics, and econometrics deal with similar problems yet fail to learn from one another.
I understand this is a toy project, but he is in a position where he is teaching people how to use these methods, and he gives the wrong impression. The next person might use the same flawed logic while building a tool for disease-prevalence prediction.
To be fair, he did explain that he thought the host name might be indicative of whether or not it would be druck. If he knew exactly how that was the case (and if it was already known to have an effect), why bother with machine learning? Just write an explicit scoring mechanism.
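If the relationship were really that well understood, a few hand-written rules would do the job. A minimal sketch of what I mean, in Python (the host-name patterns here are entirely invented for illustration, not anything from the talk):

    SUSPICIOUS_PATTERNS = ("tmp", "test", "dev")

    def score_host(hostname: str) -> int:
        # Hand-coded score: count how many known patterns appear in the name.
        return sum(pattern in hostname.lower() for pattern in SUSPICIOUS_PATTERNS)

    print(score_host("dev-test-01.example.com"))  # -> 2

No training data, no pipelines, and you can read off exactly why a host gets the score it does.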
It did seem to me that this was an interpretation he came up with after he tried many different pipelines and "flipped all the switches". There are many sources of randomness that warrant using statistical methods, but it feels strange to see people use these tools without giving much thought to parameter stability, parameter significance, causality, or model selection in general.
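To make 'parameter stability' concrete, here is a minimal sketch of the kind of check I have in mind, in Python with scikit-learn (the data and the single informative feature are made up for the example):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils import resample

    rng = np.random.RandomState(0)
    X = rng.normal(size=(500, 3))  # three candidate features
    y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)  # only feature 0 matters

    # Refit on bootstrap resamples and watch how much each coefficient moves.
    coefs = np.array([
        LogisticRegression().fit(*resample(X, y, random_state=i)).coef_[0]
        for i in range(200)
    ])
    for j in range(X.shape[1]):
        lo, hi = np.percentile(coefs[:, j], [2.5, 97.5])
        print("feature %d: 95%% bootstrap interval [%.2f, %.2f]" % (j, lo, hi))

A coefficient whose interval comfortably straddles zero is exactly the kind of 'because, who knows, it might be helpful' input I would want to question before trusting the model.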
That's exactly the recent criticism of 'big data': engineers and others getting correlations they don't understand from all the data they can collect, and attempting to use them for who knows what.
The presentation did a wonderful job of providing a high-level introduction to the idea of machine learning, but anyone who is seriously interested in ML should pick up some of the books he mentioned.