I'm fairly sure the model is implemented in a reasonable way. It's an experimental deep generative model based on https://github.com/openai/glow, though more complex: both the warp and its inverse are evaluated at training time, and their outputs are fed to other components. The warp has around 200 layers, IIRC. The model has to keep track of the log-determinant of the warp after each operation, along with the derivatives of those quantities, so the graph can get pretty huge.
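
To make the log-determinant bookkeeping concrete, here is a minimal pure-Python sketch of the general pattern. It is not my actual model: `Coupling`, `Reverse`, `warp_forward`, and `warp_inverse` are hypothetical names, and the tanh-bounded scale is just a stand-in for the neural net inside a real coupling layer. Each layer takes the running log-determinant and returns the updated one, and the inverse pass walks the layers backwards and subtracts the same terms.

```python
import math

class Coupling:
    """Toy affine coupling layer on an even-length list of floats.

    The first half passes through unchanged and parameterises an affine
    map of the second half, so the Jacobian is triangular.
    """
    def __init__(self, w, c):
        self.w, self.c = w, c  # hypothetical scalar "network" parameters

    def _params(self, a):
        m = sum(a) / len(a)
        return math.tanh(self.w * m), self.c * m  # (log-scale, shift)

    def forward(self, x, logdet):
        h = len(x) // 2
        a, b = x[:h], x[h:]
        log_s, t = self._params(a)
        y = a + [bi * math.exp(log_s) + t for bi in b]
        # Each transformed coordinate contributes log_s to log|det J|.
        return y, logdet + len(b) * log_s

    def inverse(self, y, logdet):
        h = len(y) // 2
        a, b = y[:h], y[h:]
        log_s, t = self._params(a)  # a is unchanged, so params match forward
        x = a + [(bi - t) * math.exp(-log_s) for bi in b]
        return x, logdet - len(b) * log_s

class Reverse:
    """Permutation layer: reverses the coordinates, log-determinant is 0."""
    def forward(self, x, logdet):
        return x[::-1], logdet
    inverse = forward  # reversing twice is the identity

def warp_forward(layers, x):
    logdet = 0.0
    for layer in layers:
        x, logdet = layer.forward(x, logdet)
    return x, logdet

def warp_inverse(layers, y):
    logdet = 0.0
    for layer in reversed(layers):
        y, logdet = layer.inverse(y, logdet)
    return y, logdet

# Small stack; the real warp would have far more layers.
layers = [Coupling(0.7, 0.3), Reverse(), Coupling(-0.4, 1.1), Reverse()]
x = [0.5, -1.0, 2.0, 0.3]
y, ld_fwd = warp_forward(layers, x)
x_rec, ld_inv = warp_inverse(layers, y)
print(max(abs(p - q) for p, q in zip(x, x_rec)), ld_fwd + ld_inv)
```

Both printed values should be zero up to floating-point rounding. In an autodiff framework every one of those per-layer log-determinant updates becomes graph nodes, for both directions, plus the nodes for their derivatives, which is why the graph grows so quickly with depth.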