Hacker News

What is the inline metadata trick?



It's an old trick in generative models; I've been using it since 2015: https://www.gwern.net/RNN-metadata When you have categorical or other metadata, instead of trying to find some way to hardwire it into the NN with a special one-hot vector or the like, you simply inline it into the dataset itself as a text prefix, and then let the model figure it out. If the model is at all good, like a char-RNN, it will learn what the metadata is and how to use it. This gives you a very easy, generic approach to encoding any metadata, which lets you extend it indefinitely without retraining from scratch (even reusing models not trained with it in the first place, like OA's GPT-2-1.5b), while still controlling generation. With GPT-2 in particular, you see this used in (among others) Grover and CTRL, in addition to my own poetry/music/SubSim models.
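A minimal sketch of what "inlining metadata as a text prefix" looks like in practice. The record fields, prefix format, and `make_sample` helper here are invented for illustration; the point is only that metadata becomes ordinary tokens in the training corpus, and controlling generation later is just prompting with the same prefix.

```python
# Hypothetical example of the inline-metadata trick: rather than wiring
# metadata into the model as a one-hot input, prepend it to each training
# sample as plain text and let the model learn the association.

records = [
    {"author": "Dickinson", "genre": "poem",
     "text": "Hope is the thing with feathers"},
    {"author": "Whitman", "genre": "poem",
     "text": "I celebrate myself, and sing myself"},
]

def make_sample(rec):
    # The metadata is just a text prefix; the model sees it as normal tokens.
    prefix = f"AUTHOR: {rec['author']}|GENRE: {rec['genre']}\n"
    return prefix + rec["text"]

# Build the training corpus with metadata inlined into every sample.
corpus = "\n\n".join(make_sample(r) for r in records)

# At generation time, conditioning on metadata is just prompting with
# the same prefix format the model saw during training:
prompt = "AUTHOR: Dickinson|GENRE: poem\n"
```

Because the metadata lives in the text itself, adding a new field later (say, a year or a style tag) means reformatting the corpus and fine-tuning, not changing the model architecture.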



