Something I don't understand is how to go about "prompting" GPT-3...
It's basically one big auto-complete, yes? It's a function that approximates the most likely completion for the input text, right?
So if you directly "ask" it a question, like "What's a great cookie recipe?", wouldn't the "completion" for that query be from the perspective of the person asking?
Wouldn't it be better to prompt:
"Here is the greatest cookie recipe I've ever seen:" and then letting it complete on that?
I gave GPT-3 your prompt. It seemed to focus a bit more on enthusiasm about the recipe than on literally quoting the recipe.
> Here is the greatest cookie recipe I've ever seen: Quick Chocolate Peanut Butter Chip Cookies from the Minimalist Baker.
> I've made it four times in the last two weeks. It only takes 12 minutes to whip up and yields the most amazing cookies. These are soft, chewy, gooey, delicious, thick, thin, crunchy and mega chocolatey.
> It was hard to stop eating the cookies while I took the photos (even though I was on a 20-minute break from the photo shoot of my new ebook cover).
> Here is what the recipe looks like when printed out:
> Like all recipes from the Minimalist Baker, it's so simple, it's insane. No need for complicated gluten-free flours, unusual mixes of ingredients or 7-step recipes. Just:
> Mix flour, sugar, baking soda, baking powder and salt
> Mix in chopped chocolate and peanut butter
> Add melted butter and milk
> Stir until combined
> Drop onto cookie sheet
> This recipe calls for 3/4 cup of peanut butter. I used Trader Joe's creamy peanut butter, and that's how it looks:
This confirms something I've noticed about GPT-3, or at least GPT-3 as it is trained using the public internet as a corpus...
This response reads exactly like a so-called "recipes" website, in which the writer gives their whole life story and assorted side notes, wandering around for several paragraphs before finally getting to the damn recipe.
This makes me think the public internet is not the most sanitary input for training. That type of "recipe" evolved, IMO, to snatch the highest SEO rankings, adding a bunch of keywords, snippets, affiliate links, etc., instead of just giving me the text of the recipe. And now GPT-3 has learned the same SEO tricks (at least when you give it my input, which is a very click-baity opening, to be fair...)
I tried a new prompt with the name of the recipe from the previous output and the word "Ingredients:".
> Quick Chocolate Peanut Butter Chip Cookies from the Minimalist Baker.
> Ingredients:
> 2 cups almond flour
> 1/4 cup cocoa powder
> 1/2 teaspoon baking soda
> 1/4 teaspoon fine sea salt
> 1/2 cup natural peanut butter
> 3/4 cup agave nectar (or brown rice syrup or coconut nectar)
> 1 teaspoon vanilla extract
> 1 tablespoon applesauce
> 1 cup chocolate chips
> Directions:
> Preheat oven to 350 degrees F and line a baking sheet with parchment paper. Mix together almond flour, cocoa powder, baking soda and salt in a large bowl. In a small bowl or measuring cup, mix peanut butter, agave nectar, vanilla, and applesauce until combined. Add wet ingredients to the dry ingredients and stir until just combined. Add chocolate chips and stir gently until just combined. Use a 1-2 tablespoon cookie scoop to scoop cookies onto the lined baking sheet. (The cookies will not spread, so it doesn’t matter how closely you space them.) Bake for 10-12 minutes.
> Time:
> 30 minutes to prep and cook
It's very good at producing something that credibly looks like a baking recipe. I'm not sure whether it would be any good, but maybe I should try to bake it.
It's amusing that the first completion suggested the recipe would use only common baking ingredients, while the second recipe notably uses almond flour instead of wheat flour, and agave nectar instead of sugar. (I'm now quite curious to actually bake these GPT-3-devised cookies and see how they come out... although I think I might use wheat flour instead of almond flour if I do!)
Yes, the most likely completion, but not necessarily from the same writer. The model learns from any text it has seen, which includes text in all sorts of formats, like interviews. So when the prompt ends with a question, the most likely next sentence is often the answer (from another person). To make that even more explicit, use a prefix like "Q1? A1. Q2? A2. Q3?", where Q3 is your question, and then let it auto-complete. The most likely completion is "A3".
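To make that concrete, here's a minimal sketch of the Q/A-prefix trick using the openai Python client of that era. The engine name, sampling parameters, and example questions are my own placeholders, not anything prescribed:

    import openai  # pip install openai; assumes openai.api_key is set

    # Seed the prompt with worked Q/A pairs so the most likely
    # continuation of the final "A:" is an answer, not more questions.
    prompt = (
        "Q: What is the capital of France?\n"
        "A: Paris.\n"
        "Q: How many legs does a spider have?\n"
        "A: Eight.\n"
        "Q: What's a great cookie recipe?\n"
        "A:"
    )

    response = openai.Completion.create(
        engine="davinci",   # base GPT-3 model
        prompt=prompt,
        max_tokens=64,
        temperature=0.7,
        stop="\nQ:",        # cut it off before it invents the next question
    )
    print(response.choices[0].text.strip())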
Now I'd be interested to see GPT-3 trained on code samples from open-source repositories. Would its output compile?
Could it produce a decent implementation of an algorithm if you fed it the comment describing it? What about more general statements about inputs and outputs?
The holy grail would be to code just by describing what you expect the code to do: a few plain-language sentences (maybe a more structured subset?) stitched together, with the API glue autocompleted.
And for reverse-engineering? Train it on drivers, then feed it packet captures. Could it make sense of the data?
"Deep TabNine is trained on around 2 million files from GitHub. During training, its goal is to predict each token given the tokens that come before it....Deep TabNine is based on GPT-2."
So this is GPT-2, not GPT-3, and it's designed to give line-by-line autocompletions, but I gather that, the way we're headed, the answer to your first question is approaching "yes"...
There was some good discussion about this on another GPT-3 thread this weekend, but I don't have the link handy.
The author prompted GPT-3 with some arithmetic questions, with its answers in parentheses: what is 10 - 1? (9), 100 - 1? (99), 1000 - 1? (999), 10000 - 1? (9099). That is, after a while it can't "recurse" deeply enough to keep getting the right answer. The author also asked it some coding questions; it could answer something like "write a Ruby function to count the number of Xs in a word" but not "reverse the list [foo bar baz]" (not the exact examples, sorry). There again it seems to reach a point where it gets the idea but can't compute deeply enough to actually answer this sort of question.
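If you want to poke at that arithmetic ceiling yourself, the prompt is just a few worked examples followed by the next question. This is my paraphrase of the setup, not the author's exact prompt:

    import openai  # assumes openai.api_key is set

    prompt = (
        "Q: What is 10 - 1?\n"
        "A: 9\n"
        "Q: What is 100 - 1?\n"
        "A: 99\n"
        "Q: What is 1000 - 1?\n"
        "A: 999\n"
        "Q: What is 10000 - 1?\n"
        "A:"
    )
    response = openai.Completion.create(
        engine="davinci", prompt=prompt, max_tokens=8, temperature=0.0, stop="\n"
    )
    # The interesting part is how many digits you can add before the
    # completion stops being the right number.
    print(response.choices[0].text.strip())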
Yes, no doubt it is impressive. But some people are speculating that a lot of cherry-picking was done for this demo. I have access to GPT-3, but I am unable to reproduce such results.
Yep, that seems correct, and this is why it's been exciting to see these demonstrations from an AI research standpoint: it was pretty unexpected that you could 'coach' a giant autocomplete so powerfully through prompting.
In short, yes. I think of it as Plinko with a hundred billion pegs... the output is extremely sensitive to the starting conditions (including things like whitespace).
Given that we really don't know how it all works, it's safe to say that prompt composition is an art form one can develop a skill at.
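One cheap way to see that sensitivity is to vary only the trailing whitespace of a prompt and compare what comes back. This is a hypothetical probe, not something anyone in this thread actually ran:

    import openai  # assumes openai.api_key is set

    base = "Here is the greatest cookie recipe I've ever seen:"
    for prompt in (base, base + " ", base + "\n"):
        response = openai.Completion.create(
            engine="davinci", prompt=prompt, max_tokens=32, temperature=0.7
        )
        # repr() makes the trailing space/newline visible in the output
        print(repr(prompt[-5:]), "->", response.choices[0].text[:60])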
I was on a Discord server with some other folks, and one of them had access to the GPT-3 API. We provided a prompt similar to the following:
This person is a Canadian singer, songwriter, and actor. Discovered at 13 years
old by talent manager Scooter Braun after he had watched his YouTube cover song
videos, they were signed to RBMG Records in 2008. With this person's debut EP My
World, released in late 2009, they became the first artist to have seven songs from
a debut record chart on the Billboard Hot 100.
Q: What is this person's name?
A:
At first pass it gave some weird answer. So we deleted the answer and tried again. The second time it wrote 'Justin Drew Bieber'.
That chunk of biographical info is from his Wikipedia page, but it's remarkable to me that it was able to even understand the question, let alone mine the relationships and cough up the right answer.
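For anyone who wants to replay it, the whole thing is one prompt: the biography paragraph plus the Q/A scaffold. A sketch follows; the parameter values are my guesses, not what the person on Discord actually used, and the bio is truncated here:

    import openai  # assumes openai.api_key is set

    bio = (
        "This person is a Canadian singer, songwriter, and actor. "
        "Discovered at 13 years old by talent manager Scooter Braun ..."
    )  # the rest of the paragraph from the comment above goes here
    prompt = bio + "\nQ: What is this person's name?\nA:"

    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=8,
        temperature=0.7,
        stop="\n",  # stop at the end of the answer line
    )
    print(response.choices[0].text.strip())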
That sounds well within the capabilities of IBM's Jeopardy bot, Watson. That was demoed 10 years ago and, while initially impressive, seemed to be mostly cheap tricks even then.
I typed your text directly into Google, and it immediately gave a context box with the correct answer.
I even typed in: "person who sings EP MyWorld" and got the same answer. Given how much data there is about him (the infamous Twitter scaling problems), I don't think it's necessarily a good test. There's gonna be tons of relations in the corpus for him.
The default sampling parameters are really not that hot for cases where there is a nearly unique correct answer: bad luck can force it to answer wrong.
From that prompt, I (a human) don't know whether to expect something like "Albert Einstein" or the straight answer, so higher-temperature sampling risks kicking it down a silly path.
I put the same prompt into the question-answering preset, and it immediately answered:
A: Justin Bieber.
So that suggests the prompt and temperature (etc.) are relevant to getting the kind of behavior you want. But it didn't seem to have any trouble getting this right on the first try with the appropriate preset.
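To make the temperature point concrete: for a question with a nearly unique correct answer, you'd pin the temperature at (or near) zero so the sampler stops rolling dice. Again a sketch, not the preset's actual settings:

    import openai  # assumes openai.api_key is set

    prompt = "...biography paragraph...\nQ: What is this person's name?\nA:"

    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=8,
        temperature=0.0,  # pick the most likely token at each step
        stop="\n",
    )
    # With temperature 0, the answer to a near-unique factual question
    # should be stable across runs instead of occasionally going silly.
    print(response.choices[0].text.strip())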