Now I'd be interested to see GPT-3 trained on code samples from open-source repositories. Would it compile?
Could it output a decent implementation of an algorithm if you were to feed it the comment describing it (something like the sketch below)? How about more general statements about inputs and outputs?
The holy grail would be to code just by describing what you expect the code to do: a few plain-language sentences (maybe in a more structured subset?) stitched together, with the API glue autocompleted.
And for reverse-engineering? Train it on drivers, then feed it packet captures. Could it make sense of the data?
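To make the first question concrete, here is a hypothetical prompt-and-completion pair (Python and binary search are my own stand-ins, not anything taken from GPT-3 or a demo): the comment is what you'd feed the model, the body is the kind of completion you'd hope for.

    # Prompt: a plain-language description of the algorithm.
    # "Given a sorted list and a target value, return the index of the
    #  target, or -1 if it is not present. Use binary search."
    #
    # Hoped-for completion:
    def binary_search(items, target):
        lo, hi = 0, len(items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if items[mid] == target:
                return mid
            if items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1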
"Deep TabNine is trained on around 2 million files from GitHub. During training, its goal is to predict each token given the tokens that come before it....Deep TabNine is based on GPT-2."
So this is GPT-2 not GPT-3, and it's designed to give line-by-line autocompletions, but I'm gathering that the way we're headed, the answer to your first question is approaching "yes"...
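For what it's worth, the objective the quote describes is plain next-token prediction; here's a minimal sketch of that loss, assuming PyTorch and toy tensors (nothing below is TabNine's actual pipeline):

    import torch
    import torch.nn.functional as F

    tokens = torch.tensor([12, 7, 3, 42, 5])            # one tokenized line of code
    vocab_size = 1000
    logits = torch.randn(len(tokens) - 1, vocab_size)   # stand-in for model output:
                                                         # logits[t] scores token t+1 given tokens 0..t

    # "predict each token given the tokens that come before it"
    loss = F.cross_entropy(logits, tokens[1:])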
There was some good discussion about this in another GPT-3 thread this weekend, but I don't have the link handy.
The author prompted GPT-3 with questions like: what is 10-1 (it answered 9), 100-1 (99), 1000-1 (999), 10000-1 (9099, not the correct 9999); i.e. after a while it can't really "recurse" deeply enough to get the right answer anymore. The author also asked it some coding questions; it could answer something like "write a Ruby function to count the number of Xs in a word" but not "reverse the list [foo bar baz]" (not the exact examples, sorry). Here again there seems to be a point where it gets the idea, but can't compute deeply enough to actually answer this sort of question.
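For scale, the reference answers to both coding questions are one-liners; here is a Python sketch (the commenter's examples were in Ruby and paraphrased, so these are stand-ins):

    # Count the number of times a letter appears in a word.
    def count_letter(word, letter):
        return sum(1 for ch in word if ch == letter)

    count_letter("banana", "a")               # -> 3
    list(reversed(["foo", "bar", "baz"]))     # -> ['baz', 'bar', 'foo']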
Yes, no doubt it is impressive. But some people are speculating that a lot of cherry-picking was done for this demo. I have access to GPT-3 but I am unable to reproduce such results.