For one thing, they're already at human performance.
For another, I don't think you realize how expensive inference can get. Microsoft, with no small amount of available compute, is struggling to run GPT-4 to the point that they're rationing it between subsidiaries while they try to scale up compute.
So saying it would be economically sound if it cost 10x or 100x what it costs now is a joke.
This tells me you haven't really stress tested the model. GPT is currently at the stage of "person who is at the meeting, but not really paying attention, so you have to call them out". Once GPT is pushed, it scrambles and falls over for most applications. The failure modes range from contradicting itself and making things up in applications that shouldn't allow it, to ignoring prompts, to simply being unable to perform tasks at all.
We have given it extensions, and really the extensions do a lot of the work. The tool that judges the style and correctness of the text based on its embedding does much of the heavy lifting; GPT essentially handles generating the text and the dense representations of it.
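To make that concrete, here's a minimal sketch of what an embedding-based judge of that kind might look like. Everything here is a stand-in: `embed()` is a toy bag-of-characters vectorizer in place of whatever model actually produces the dense representations, and `judge()` just scores a candidate by cosine similarity against reference texts.

```python
# Minimal sketch of an embedding-based judge: score a candidate text by
# cosine similarity of its dense vector against reference vectors.
import math
from collections import Counter


def embed(text: str) -> dict:
    # Toy "dense representation": L2-normalized character counts.
    # A real system would use model-produced embeddings instead.
    counts = Counter(text.lower())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {ch: c / norm for ch, c in counts.items()}


def cosine(a: dict, b: dict) -> float:
    # Dot product of two unit vectors stored as sparse dicts.
    return sum(a[k] * b.get(k, 0.0) for k in a)


def judge(candidate: str, references: list[str]) -> float:
    # Score = best similarity to any reference exemplar; a real judge
    # would compare against style/correctness exemplars the same way.
    cand = embed(candidate)
    return max(cosine(cand, embed(r)) for r in references)
```

The point is just the division of labor: the generator produces text and vectors, and a separate, much simpler component does the accept/reject work.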
Still waiting to see those plugins rolled out and actual vector DB integration with GPT-4; then we'll see what it can really do. It seems like the more context you give it, the better it does, but the current UI makes it hard to provide that.
Plus the recursive self-prompting to improve accuracy.
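The self-prompting loop mentioned above can be sketched in a few lines. This is a hypothetical harness, not any particular product's implementation: `call_model()` is a stub standing in for a real LLM API call, and `score()` stands in for whatever quality signal you have (a judge model, tests, etc.).

```python
# Sketch of recursive self-prompting: feed the model's own draft back to
# itself with a critique prompt, keeping a revision only if it scores better.


def call_model(prompt: str) -> str:
    # Stub: a real implementation would call an LLM API here.
    return prompt.upper()  # placeholder transformation for the sketch


def refine(question: str, score, rounds: int = 3) -> str:
    answer = call_model(question)
    for _ in range(rounds):
        critique_prompt = (
            f"Question: {question}\nDraft answer: {answer}\n"
            "Critique the draft and write an improved answer."
        )
        revised = call_model(critique_prompt)
        if score(revised) > score(answer):
            answer = revised  # keep the revision only if it improves
    return answer
```

The key design choice is that the loop only accepts a revision when the external scorer says it improved, which is what keeps repeated self-prompting from drifting into worse answers.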
How are they at human performance? Almost everything GPT has read on the internet didn't even exist 200 years ago and was invented by humans. Heck, even most of the programming it does wasn't there 20 years ago.
Not every programmer starting from scratch would be brilliant, but many were self-taught with very limited resources in the 80s, for example, and discovered new things from there.
GPT cannot do this and is very far from being able to.
Because it performs at least at average human level (mostly well above average) on basically every task it's given.
"Invent something new" is a nonsensical benchmark for human-level intelligence. The vast majority of people have never and will never invent anything new.
If your general intelligence test can't be passed by a good chunk of humanity, then it's not a general intelligence test, unless you want to say most people aren't generally intelligent.
I would argue some programmers do in fact invent something new. Not all of them, but some. Perhaps 10%.
Second, the point is not whether everyone is an inventor by profession but whether most people can be inventors. And to a degree, they can be. I think you underestimate that by a large margin.
You can lock people in a room and give them a problem to solve, and they will invent a lot if they have the time to do it. GPT will invent nothing right now. It's not there yet.