Now I'd be interested to see GPT-3 trained on code samples from open-source repositories. Would it compile?
Could it output a decent implementation of an algorithm if you were to feed it the comment describing it (something like the sketch below)? How about more general statements about inputs and outputs?
The holy grail would be to code just by describing what you expect the code to do: a few plain-language sentences (maybe in a more structured subset?) stitched together, with the API glue autocompleted.
And for reverse-engineering? Train it on drivers, then feed it packet captures. Could it make sense of the data?
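To make the first question concrete, here is a hypothetical prompt-and-completion pair (Python and binary search are my own stand-ins, not anything taken from GPT-3 or a demo): the comment is what you'd feed the model, the body is the kind of completion you'd hope for.

    # Prompt: a plain-language description of the algorithm.
    # "Given a sorted list and a target value, return the index of the
    #  target, or -1 if it is not present. Use binary search."
    #
    # Hoped-for completion:
    def binary_search(items, target):
        lo, hi = 0, len(items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if items[mid] == target:
                return mid
            if items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1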
"Deep TabNine is trained on around 2 million files from GitHub. During training, its goal is to predict each token given the tokens that come before it....Deep TabNine is based on GPT-2."
So this is GPT-2 not GPT-3, and it's designed to give line-by-line autocompletions, but I'm gathering that the way we're headed, the answer to your first question is approaching "yes"...
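For what it's worth, the objective the quote describes is plain next-token prediction; here's a minimal sketch of that loss, assuming PyTorch and toy tensors (nothing below is TabNine's actual pipeline):

    import torch
    import torch.nn.functional as F

    tokens = torch.tensor([12, 7, 3, 42, 5])            # one tokenized line of code
    vocab_size = 1000
    logits = torch.randn(len(tokens) - 1, vocab_size)   # stand-in for model output:
                                                         # logits[t] scores token t+1 given tokens 0..t

    # "predict each token given the tokens that come before it"
    loss = F.cross_entropy(logits, tokens[1:])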
There was some good discussion about this in another GPT-3 thread this weekend, but I don't have the link handy.
The author prompted GPT-3 with questions like: what is 10-1 (it answered 9), 100-1 (99), 1000-1 (999), 10000-1 (9099, not the correct 9999); i.e. after a while it can't really "recurse" deeply enough to get the right answer anymore. The author also asked it some coding questions; it could answer something like "write a Ruby function to count the number of Xs in a word" but not "reverse the list [foo bar baz]" (not the exact examples, sorry). Here again there seems to be a point where it gets the idea, but can't compute deeply enough to actually answer this sort of question.
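For scale, the reference answers to both coding questions are one-liners; here is a Python sketch (the commenter's examples were in Ruby and paraphrased, so these are stand-ins):

    # Count the number of times a letter appears in a word.
    def count_letter(word, letter):
        return sum(1 for ch in word if ch == letter)

    count_letter("banana", "a")               # -> 3
    list(reversed(["foo", "bar", "baz"]))     # -> ['baz', 'bar', 'foo']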
Yes, no doubt it is impressive. But some people are speculating that a lot of cherry-picking was done for this demo. I have access to GPT-3 but I am unable to reproduce such results.