This is awesome! There are a lot of interesting things that can be done with just the level of sequence generation AI we currently have. Transforming the real data into a format it can use is the most difficult part, I think.
This would make for an interesting variation on the Turing test: can a player tell whether a given level was created by a professional game designer or by an AI?
The authors define a "linearity" metric for a level as "how close the level can be fit to a line". This seems like a vague definition: is it the optimal path taken by the player that is being fit to a line?
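To make the question concrete, here is one possible reading, sketched in Python. This is my guess, not the authors' definition: it assumes the level is reduced to a list of platform heights (one per column), fits a least-squares line to them, and reports the mean absolute deviation from that line. The name `linearity` and the height-list input are my own assumptions.

```python
def linearity(heights):
    """Mean absolute deviation of per-column heights from their
    least-squares line. 0.0 means the heights lie exactly on a line.

    Assumption: `heights` is a hypothetical reduction of a level to one
    platform height per column; the paper may measure something else.
    """
    n = len(heights)
    if n < 2:
        return 0.0  # zero or one point always fits a line

    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(heights) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, heights))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Average vertical distance between each height and the fitted line.
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, heights)) / n
```

Under this reading a steady staircase scores 0 and a level that zigzags up and down scores higher. Fitting the player's optimal path instead would need a pathfinder first, which is exactly the ambiguity.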
This similar project from a few months ago does Mario level generation and discusses the training data sequence creation in greater detail: https://medium.com/@ageitgey/machine-learning-is-fun-part-2-...
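For anyone wondering what "training data sequence creation" can look like, here is a minimal sketch of one common approach: turn a tile grid into text, one character per tile, emitted column by column so the sequence follows the level from left to right. This is not the linked article's code; the tile characters and the function name `level_to_sequence` are made up for illustration.

```python
def level_to_sequence(rows):
    """Flatten a tile grid into text, one line per column, left to right.

    Assumption: `rows` is a list of equal-length strings, top row first,
    where each character is a hypothetical tile code.
    """
    if not rows:
        return ""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")

    # Read each column top to bottom, then join the columns with newlines.
    columns = ("".join(row[x] for row in rows) for x in range(width))
    return "\n".join(columns)


# Hypothetical 3x4 level: '-' is sky, '?' is a block, '#' is ground.
level = [
    "--?-",
    "----",
    "#-##",
]
print(level_to_sequence(level))
```

Each output line is one vertical slice of the level, so a sequence model trained on this text sees the level in the same order a player walks through it.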