To me, the article somewhat misses the point of what's interesting here. Using ASTs to represent equations, or even whole programs, has plenty of precedent in ML/AI. I'd have liked to know how exactly they translate these trees into a representation suitable for an ANN. Fortunately, the paper seems to be easy to find and access (it's [1], I guess).
It looks like they go from tree -> sequence via prefix notation. I'm curious why Lample decided on this seq2seq approach when it seems there might be models that could be applied more naturally to the tree structure directly [1, 2].
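For concreteness, the tree -> sequence step is just a pre-order traversal: prefix (Polish) notation needs no parentheses, and as long as every operator has a fixed arity the original tree can be reconstructed from the token list. A minimal sketch (the `Node` class and token names are mine, not the paper's):

```python
# Illustrative sketch of flattening an expression tree into a prefix
# token sequence, as in the tree -> seq2seq setup discussed above.

class Node:
    def __init__(self, label, children=()):
        self.label = label            # operator name or leaf token
        self.children = list(children)

def to_prefix(node):
    """Pre-order traversal: emit the operator, then each subtree."""
    tokens = [node.label]
    for child in node.children:
        tokens.extend(to_prefix(child))
    return tokens

# 2 + 3*x  becomes  ['add', '2', 'mul', '3', 'x']
tree = Node('add', [Node('2'), Node('mul', [Node('3'), Node('x')])])
print(to_prefix(tree))
```

Because the arity of `add` and `mul` is fixed, the sequence is unambiguous and the model never has to learn bracket matching.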
For the same reason people use huge transformers for sentences in natural language (which are also tree-structured): they scale really well. With enough data, huge transformers have huge capacity. Notice that this paper is entirely about how to cleverly generate a massive dataset; there is no novelty in the model -- they just use a standard approach described in two paragraphs.
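The data-generation trick for integration is worth spelling out: instead of integrating random expressions (hard), you can sample a random expression f, differentiate it symbolically (easy), and train on the pair (f', f). A hedged sketch using sympy; the operator set and sampling probabilities here are invented for illustration, not the paper's actual settings:

```python
# Illustrative "backward" data generation for the integration task:
# sample f at random, differentiate it, and use (f', f) as a
# (problem, solution) training pair. The pair is correct by construction.
import random
import sympy as sp

x = sp.symbols('x')
LEAVES = [x, sp.Integer(1), sp.Integer(2)]
UNARY = [sp.sin, sp.exp]

def random_expr(depth):
    """Sample a small random expression tree over x (toy sampler)."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    if random.random() < 0.5:
        return random.choice(UNARY)(random_expr(depth - 1))
    # binary case: combine two subtrees with + or *
    op = random.choice([sp.Add, sp.Mul])
    return op(random_expr(depth - 1), random_expr(depth - 1))

f = random_expr(3)
pair = (sp.diff(f, x), f)   # model sees pair[0], must produce pair[1]
```

Generating millions of such pairs is cheap, which is how you feed a large transformer in a domain with no natural corpus.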
[1] https://arxiv.org/abs/1912.01412