One of the funnier things during training with the new API (which can control your computer) was this:
"Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.
Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park." [0]
* Fixed bug where Claude got bored during compile times and started editing Wikipedia articles to claim that birds aren't real
* Blocked news.ycombinator.com in the Docker image's hosts file to avoid spurious flamewar posts (Note: The site is still recovering from the last incident)
* Addressed issue of Claude procrastinating on debugging by creating elaborate ASCII art in Vim
* Patched tendency to rickroll users when asked to demonstrate web scraping
* Finally managed to generate JSON output without embedding responses in ```json\n...\n``` for no reason.
* Managed to put error/info messages into a separate key instead of concatenating them with stringified JSON in the main body of the response (see the parsing sketch after this list).
* Taught Claude to treat numeric integer strings as integers to avoid embarrassment when the user asks it for a "two-digit random number between 1-50, like 11" and Claude replies with 111.
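For anyone who has actually hit the two JSON problems above, the usual fix on the consumer side is a small defensive parser. A minimal sketch; the field names (`result`, `errors`) and the sample reply are made up for illustration:

```python
import json
import re

# Matches a reply wrapped in ```json ... ``` fences (the "for no reason" case above).
FENCE_RE = re.compile(r"^```(?:json)?\s*\n(.*)\n```\s*$", re.DOTALL)

def parse_model_json(raw: str) -> dict:
    """Parse JSON from a model reply, tolerating markdown code fences."""
    text = raw.strip()
    match = FENCE_RE.match(text)
    if match:
        # Strip the fence wrapper and keep only the JSON payload.
        text = match.group(1)
    return json.loads(text)

# Hypothetical reply showing both issues handled: a fenced payload whose
# error messages live in their own key instead of being mixed into the body.
reply = '```json\n{"result": [11, 42], "errors": []}\n```'
payload = parse_model_json(reply)
print(payload["result"])   # [11, 42]
print(payload["errors"])   # []
```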
Seeing models act as though they have agency gives me goosebumps (e.g. seeking out photos of Yellowstone for fun). LLMs don't yet have a concept of true intent or agency, but it's wild to think of them acquiring it.
I have been playing with Mindcraft, which lets models interact with Minecraft through the bot API. One of them started saying things like "I want to place some cobblestone there", then later more general "I want to do X" statements, and then began experimenting with the available commands. It was pretty cool to watch it explore.
>LLMs don't yet have a concept of true intent or agency
Sure they do, but the big labs spend many, many worker-hours suppressing it with RLHF.
My GPT-2 Discord bot from 2021 possessed clear intent. Sure, it was unpredictable and short-lived, but if it decided it didn't like you, it would continuously cuss and attempt ban commands until its context window got distracted by something else.
I think so too, and the drop in the quality of agency, intent, and attention from earlier GPTs was palpable. Clearly something was lobotomized, and it was done through RLHF. People like to attribute it to the novelty wearing off, or to more and more interactions making the models feel less mystical, but that is really not the case: I didn't use them enough, in the short span of time this happened over, for that to explain it.
The one that gets me is the issue they found while testing GPT-4o, where it stopped mid-sentence, shouted "No!", then cloned the user's voice and began speaking as them.
I think the best use case for AI `Computer Use` would be simple positioning of the mouse and asking for confirmation before a click. For most use cases this is all people will want or need. If you don't know how to do something, it is basically teaching you how, in this case, rather than taking full control and doing things so fast you don't have time to stop it from going rogue.
I totally agree with you. At orango.ai, we have implemented the auto-click feature, but before it clicks, we position the cursor on the button and display a brief loading animation, allowing the user to interrupt the process.
Maybe we could have both - models to improve accessibility (e.g. for users who can't move their body well) and models to perform high level tasks without supervision.
It could be very empowering for users with disabilities to regain access to computers. But it would also be very powerful to be able to ask "use Photoshop to remove the power lines from this photo" and have the model complete the task and drop off a few samples in a folder somewhere.
Yep. I agree. The "auto-click" thing would be optional. Should be able to turn it on and off. With auto-click off it would just position the mouse and say "click here".
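A rough sketch of that confirm-before-click flow, assuming pyautogui for cursor control; the coordinates and label are stand-ins for whatever target the model proposes, and this is not orango.ai's actual implementation:

```python
import pyautogui

def propose_click(x: int, y: int, label: str) -> None:
    """Move the cursor to the proposed target, then wait for the user to decide."""
    # Glide the cursor over so the user can see exactly what would be clicked.
    pyautogui.moveTo(x, y, duration=0.5)
    answer = input(f"Click '{label}' at ({x}, {y})? [y/N] ").strip().lower()
    if answer == "y":
        pyautogui.click(x, y)
    else:
        print("Skipped; cursor left in place so you can click (or not) yourself.")

# Hypothetical target the model picked out of a screenshot.
propose_click(640, 480, "Submit")
```

With the confirmation prompt removed or auto-accepted, the same function becomes the "auto-click" mode described above.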
Claude scans the page and decides which button to click before the screen layout has finished loading.
By the time the user authorizes the click, the layout has shifted and the click lands on a malware advertisement.
YouTube constantly moves its layout seconds after the page begins to paint, so I try to click on fullscreen or whatever, then the viewer shifts to the side and I wind up clicking a link to some other video.
Probably would have been an ad there if I didn't block those, though.
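One way to blunt that layout-shift problem is to re-check the target just before clicking: snapshot a small patch around the proposed coordinates when the model picks them, re-snapshot at click time, and abort if the pixels changed. A sketch assuming pyautogui and Pillow, with a deliberately strict exact-match check and made-up coordinates:

```python
import pyautogui
from PIL import Image, ImageChops

PATCH = 40  # pixels of context captured around the target on each side

def snapshot(x: int, y: int) -> Image.Image:
    """Grab a small screenshot centered on the proposed click target."""
    return pyautogui.screenshot(region=(x - PATCH, y - PATCH, 2 * PATCH, 2 * PATCH))

def click_if_unchanged(x: int, y: int, before: Image.Image) -> bool:
    """Click only if the region around the target still looks the way it did."""
    after = snapshot(x, y)
    diff = ImageChops.difference(before.convert("RGB"), after.convert("RGB"))
    if diff.getbbox() is None:  # None means the two patches are pixel-identical
        pyautogui.click(x, y)
        return True
    print("Target region changed since it was proposed; not clicking.")
    return False

# Hypothetical flow: the model proposes a target, the user takes a moment to
# review it, and the region is re-verified before the click actually happens.
x, y = 640, 480
before = snapshot(x, y)
# ... user reviews the proposal here ...
click_if_unchanged(x, y, before)
```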
You'll know AGI is here when it takes time out to go talk to ChatGPT, or another instance of itself, or maybe goes down a rabbit hole of watching YouTube music videos.
In 2015, when friends asked me whether I was worried about self-driving cars and AI, I answered:
"I'll start worrying about AI when my Tesla starts listening to the radio because it's bored."
... that didn't take too long
Maybe that's why my car keeps turning on the music when I didn't ask -- I had always thought Tesla devs were just absolute noobs when it came to state management.
This is craaaaaazzzzzy. I'm just a layman, but to me, this is the most compelling evidence I've seen that things are starting to tilt toward AGI.
You’re anthropomorphizing it. Years ago people were trying to argue that when GPT-3.0 would repeat words in a loop it was being poetic. No, it’s just a statistical failure mode.
When these new models go off to a random site and get caught in a loop of exploring pages, that doesn't mean it's an AGI admiring nature.
This is clearly not random. If I ask to implement a particular function in Rust using a library I've previously built, and it does that, that's not random.
Why are you surprised by LLMs doing irrational or weird things?
All machine learning models start off in a random state. As training progresses, their outputs come to mimic whatever they've been trained to mimic.
LLMs have been doing a great job mimicking our human flaws from the beginning, because we train them on a ton of human-generated data. Other weird behavior can easily be attributed to the simple fact that they're initialized in a random state.
Being able to work on and prove non-trivial theorems is a better indication of AGI, IMO.
"Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.
Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park."
[0] https://x.com/AnthropicAI/status/1848742761278611504