I looked at it, and it is very good: solid baselines, explanations, and visualizations, and it goes deeper than the typical "it's a black box, but if you copy & paste it will work".
Jane Street is noted for its use of OCaml, so it's interesting to see that their researchers do indeed use Python (judging from the code in that post, at least).
There is a project that aims to bring NN/ML/DL/RL (along with other scientific computing) into the OCaml world - Owl[1]. They also have a list[2] of potentially interesting ideas for new contributors to take on.
This is a popularization of things already published in open papers, so it does not reveal anything specific about their activities. Any place employing deep ML practitioners could have written this.
It could even be a red herring, as the most popular application of batch norm is to deep CNNs, and those are mostly used on computer vision problems. CV does not seem important for option pricing, which is AFAIK Jane Street's big money maker. Of course I can be very wrong about this. People have tried using image data as auxiliary inputs alongside financial data. Or you can apply deep CNNs to 1D data like time series - see WaveNet applied to time-series forecasting.
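Roughly, that means a stack of dilated causal 1D convolutions over the series. A minimal PyTorch sketch of the idea (layer sizes and names are just illustrative, not from any of the posts mentioned):

```python
# Sketch of a WaveNet-style stack of dilated causal 1D convolutions
# for time-series forecasting. Sizes/names are illustrative only.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        # Left-pad so the output at time t only sees inputs <= t (causality).
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)

class TinyWaveNet(nn.Module):
    def __init__(self, channels=32, layers=6):
        super().__init__()
        self.inp = nn.Conv1d(1, channels, 1)
        # Dilations 1, 2, 4, ... give an exponentially growing receptive field.
        self.blocks = nn.ModuleList(
            CausalConv1d(channels, channels, kernel_size=2, dilation=2 ** i)
            for i in range(layers)
        )
        self.out = nn.Conv1d(channels, 1, 1)

    def forward(self, x):
        h = self.inp(x)
        for block in self.blocks:
            h = torch.relu(block(h)) + h       # residual connection
        return self.out(h)                     # one-step-ahead prediction per step

model = TinyWaveNet()
series = torch.randn(8, 1, 128)                # (batch, 1 channel, 128 steps)
pred = model(series)                           # same shape as the input
```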
I wasn't familiar with batch normalization before, but I've had to do something similar in Stan to enforce that some model parameters (not data) had exactly mean 0 and standard deviation 1.
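For reference, the core operation in both cases is just standardization; a minimal numpy sketch of the batch-norm forward pass (gamma/beta are the usual learnable scale and shift, eps is a small numerical-stability constant):

```python
# Minimal numpy sketch of the batch-norm forward pass (training mode).
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: (batch, features); normalize each feature over the batch
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # ~mean 0, ~std 1 per feature
    return gamma * x_hat + beta

x = np.random.randn(64, 10) * 3 + 5
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))
```

The Stan trick described above is essentially the same transform applied to parameters rather than activations (and without eps, so the mean-0/std-1 constraint holds exactly).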
Wow. You straight up copy-pasted the top reddit comment on this article from 5 months ago [0]. Funny thing is that the article mentions making corrections due to that comment (also 5 months ago), so your stolen comment isn't even relevant anymore.
https://twitter.com/dcpage3/status/1141700299071066112
Disclaimer: I work at Myrtle!