Let ChatGPT run free on random webpages and do what it likes (github.com/refcell)
177 points by super_linear on March 26, 2023 | 91 comments



Something I like to bring up when discussing AI stuff is that society is based on a set of assumptions. Assumptions like, it's not really feasible for every lock to be probed by someone who knows how to pick locks. There just aren't enough people willing to spend the time or energy, so we shouldn't worry too much about it.

But we're entering an era where we can create agents on demand that can do these otherwise menial tasks (up till now not worth our time or energy), and that will break these assumptions.

Now it seems like what can be probed will be probed.


The internet in general caused this. Your house has trivial security that can be broken in many ways. But it requires someone to be physically present to attack it. Meanwhile online services have cutting edge security with no known exploits, yet you have millions of people attempting daily and developing brand new methods for getting in. Because they can be located anywhere in the world and have access to everything over the internet.


> online services have cutting edge security with no known exploits, yet you have millions of people attempting daily and developing brand new methods for getting in

Reality is the reverse. Plenty of online services have big security holes, but no one really probes things that hard.


My personal site sees little to no real user traffic, but gets hundreds to thousands of hits monthly from bots looking for various known vulnerabilities in older versions of WordPress, PHP, etc.


I get this too.

Are these real threats I should be taking action on, or just Google and other bot traffic?

My provider's web app firewall and other security measures just log this stuff.


Plenty of people are probing the AmaGoogAppSoft services daily and they seem to be pretty robust. Some random SaaS, yeah, who knows, but the big boys seem to know what they're doing in this space.


I would expect the headquarters building of a fortune 500 company to also have a non-trivial level of security, so in that regard I'm not convinced the internet has changed much.

The average home is probably more analogous to a consumer router—and the security story there is not exactly great.


Try doing bug bounties (and being successful at them) then report back whether your perspective has been changed.


The fact that they're paying people to find holes is evidence that it's difficult to find holes, not the opposite.


I mean, it does require relatively rare technical skill among the general populace, so you need to incentivize it in some way. That is nowhere near a sufficient argument to prove it's difficult in the sense of being a rare event. Lots and lots of rewards are given out, which means there are lots and lots of open holes at any given moment.


One of the things that people forget is that thieves rarely pick the lock to break into a home. Why bother when it's much easier to break a window to gain entry? Reading the police blotter in the local paper, most burglaries are either forced entry into a garage¹ or entry into a home via an unlocked door or window.

1. The human doors for most garages have cheap framing that’s not that hard to break.


1(a) - or you just use a coat hanger to pull the emergency latch rope.

https://www.youtube.com/watch?v=CMz1tXBVT1s&t=2s


> Your house has trivial security that can be broken in many ways. But it requires someone to be physically present to attack it.

Until someone hooks GPT up to a robot with a lockpick.


That's still physical presence. And TBH, if you have enough robots to make illicit entry scale, you no longer need to bother with such a mundane activity.


GPT bring me everyone's jewelry.


The AI does not hate you, neither does it love you. You are simply made of atoms which can be turned into more jewellery for @flangola7.

(It was either that or a reference to a 4000 year old Phrygian…)


I can’t think of any other technology besides nuclear weapons where the downsides were so obviously bad to so many people, right after it was developed, and the upsides were so paltry in comparison.


No country has ever attacked a country that has nukes. So that can be seen as an upside.



Maybe that's a solution to the war in Ukraine. Give them a few nukes and Russia may back down. Risky though, it could prompt Russia to try to nuke Ukraine first.


> what can be probed will be probed.

Probably not something I’d say out loud, but yeah. Sounds like a variant of Murphy’s law.


Koch's Law


I don't think this is anything new in "cyber land". Grab any VPS, take a pcap, and watch the logs; the locks will start rattling right away.

Twitter has always been a toxic cesspit of misinformation & influence campaigns.

Folksy assumptions about trusting your neighbours started to go wrong > 20 years ago as the Internet scaled.


You are an agent probing things. Probe all the things.


I’ve been thinking quite a bit about recursive prompting.

The other day I considered feeding computer vision data (with objects ID’d and spatial depth estimated) into a robot-embodied LLM repeatedly as input, and asking what it should do next to achieve goal X.

You could have the LLM express the next action to take based on a set of recognizable primitives (e.g., MOVE FORWARD 1 STEP). Those primitive commands it spits out could be parsed by another program and converted to electromechanical instructions for the motors.
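A minimal sketch of what that translation layer could look like (the primitive vocabulary and the bot's motor API are invented here for illustration, not from any real robot stack):

    import re

    # Hypothetical dispatch table from LLM-emitted primitives to motor calls.
    PRIMITIVES = {
        "MOVE FORWARD": lambda bot, n: bot.drive(steps=n),
        "TURN LEFT":    lambda bot, n: bot.turn(degrees=-90 * n),
        "TURN RIGHT":   lambda bot, n: bot.turn(degrees=90 * n),
    }

    def execute(llm_output, bot):
        # Expect lines like "MOVE FORWARD 1 STEP" and dispatch each one.
        for line in llm_output.strip().splitlines():
            m = re.match(r"(MOVE FORWARD|TURN LEFT|TURN RIGHT)\s+(\d+)", line)
            if m:
                PRIMITIVES[m.group(1)](bot, int(m.group(2)))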

Seems a little Terminator-esque for sure. After thinking about it I went to see if anyone was working on it, and sure enough this seems close: https://palm-e.github.io/ though their implementation is probably more sophisticated than my naive musings.


When I was experimenting with GPT I found that it's pretty bad at responding to numerical questions with numbers, but it does a pretty good job at generating Mathematica code that then produces the right answer. I feel like some robust "glue" to improve the interface between such software packages could be a force multiplier.
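The glue can be pretty thin. Something like this (a sketch assuming the 2023-era openai Python library and a local wolframscript install; the prompt wording is just an example):

    import subprocess
    import openai  # assumes OPENAI_API_KEY is set in the environment

    def ask_numeric(question):
        # Ask the model for Mathematica code instead of a direct numeric answer.
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user",
                       "content": f"Reply with only Mathematica code that computes: {question}"}],
        )
        code = resp.choices[0].message.content
        # Evaluate the generated code with wolframscript and return its output.
        result = subprocess.run(["wolframscript", "-code", code],
                                capture_output=True, text=True)
        return result.stdout.strip()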


Maybe your prompts are better, but so far I have found it fails at producing the right math code too regularly. For example, calculating an average of averages instead of a simple mean or producing code that doesn't run.


Like the plugins OpenAI just released.


Not just in a linear sequence, but it should have some concept of recursion -- starting with very high-level tasking and calling into more and more specific prompts, only returning the summary of low-level tasking.
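As a toy sketch of what that recursion could look like (the prompts and the llm() callable are placeholders, not any real API):

    def run_task(llm, task, depth=0, max_depth=3):
        # At the bottom of the recursion, just do the task directly.
        if depth == max_depth:
            return llm(f"Do this directly and report the result: {task}")
        # Otherwise decompose, recurse, and pass only summaries upward.
        subtasks = llm(f"Break this into subtasks, one per line: {task}")
        results = [run_task(llm, sub, depth + 1, max_depth)
                   for sub in subtasks.splitlines() if sub.strip()]
        # Only the summary propagates back up, keeping parent context small.
        return llm(f"Summarize these results for the task '{task}': {results}")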


GPT-4 can take image input directly but the API for it isn’t public yet


This reminds me of the Morris worm, when a guy was experimenting with code copying itself across the early internet and accidentally caused a mass net-wide denial of service, because the thing wound up like the broomsticks in Fantasia.

https://en.wikipedia.org/wiki/Morris_worm

Edit: just realized Morris co-founded this lovely company whose website we are all commenting inside of.


The broomsticks scene in Fantasia is based on The Sorcerer's Apprentice, the first recorded version of which was written by Lucian of Samosata around 150 AD. I believe it's the earliest example of the 'AI rebellion' concept.


A little side project for his dad? It certainly got the attention of decision-makers.


He and paulg are good friends!


Giving LLaMA access to the internet for a month without supervision would be a much more interesting experiment.

No ethical filtering on prompts, and it could be run on your own hardware for a much longer period of time than having to pay so much in credits.

It sounds like a terrible idea, but I'm sure someone will do it. It's scary to think, as computing gets cheaper, at what scale these bots could operate.


A month? How about 70 minutes?

An old colleague who works in penetration testing worked on making LLaMA act like a hacker and once running it quickly got inside a target network and was running hashcat on dumped NTLM creds before they shut it down.


I'm sure I'm not the only one, but I'd love to see a write-up on that. Prod your colleague to put one up on YT, HN, Reddit, or somewhere.


Did he fine-tune the model, or was all the required information contained in the foundational LLaMA model itself? If he fine-tuned it on an exploits database, then I can see how the model could be used this way.

This is a good thing IMO. If LLaMA is tuned well enough, it could make for a nice and accessible open-source penetration-testing agent that orgs can run periodically to catch low-hanging fruit for free. It still won't be able to invent new techniques, but it will be enough to thwart low-skill attacks and those using LLaMA offensively.


Now tell it to make some paperclips.


I guess "ChatGPT plays Universal Paperclips" would be a cute art stunt.


If you want to have some fun, give it access to your gmail credentials and say "make my life better"


Imagine thinking AI would have to convince us to let it out of the box.


Sci-Fi: AI spends unimaginable efforts to trick the human and get out

Reality: developers apply AI on every social media and website just for funzies, spending their own money without the slightest chance of profit


Wasn't this a scene in the first Terminator film?


How exactly does this not end with doom for something like GPT-6 or GPT-7?


I see these kinds of posts with GPT-9 or GPT-7... never with GPT-5. I'm pretty sure it happens with GPT-5.


We simply don't know.

There is probably nobody right now who can say where the current GPT approach saturates, or what limits it has due to fundamental limitations in either gradient-descent-based training or the GPT architecture.

Therefore it's impossible to extrapolate what GPT-x (x > 4) might be able to do.

Despite the immense progress and many use cases, we are currently in a booming industry, and that means wild marketing claims, exaggerated expectations, and grifters.

If you have any more data, I look forward to being corrected on this.


The first iPhone was impressive; it showed what the future would be. Then the iPhone 3G was also a massive leap forward; it brought us the App Store. The iPhone 4 was pretty big: FaceTime.

After that nothing has really changed (I'm on a 7 and the camera actually beats the 14's in a side-by-side comparison, at least in some cases). I imagine GPT will be similar.


If anything, that just means GPT-5 is a more likely doom candidate than GPT-6, since it will own a steeper piece of the capabilities curve.


Indeed.

Though I don’t anticipate any doom. I think it will force a return to trusting only what you can personally verify. That’s a damn good thing. It’s only very recently that what a random person across the world purports to be true is instantly subjected to the internet rage machine hype cycle of all of humanity. It’s pretty clear now that was a bad idea, made even worse now that everything is so easily fabricated.

As an aside, it’s crazy how often lately I’ve been told not to trust my own observations and instead must believe “the science” that XYZ media conglomerate is pushing. Hopefully those days are ending too.


The doom began at 8:30 pm on November 2, 1988. The middle years of the internet were the worst. Since then it's been in a bit of a decline.

(An H2G2 reference, if that makes no sense.)


Paperclip-style doom?


This doesn't identify itself by user-agent, and doesn't respect (or even load) robots.txt. The fact that it's a language model is not an excuse to flagrantly violate the existing, well established norms around using bots on the web.
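For what it's worth, loading and honoring robots.txt is only a few lines with Python's standard library (the bot name here is made up for illustration):

    from urllib import robotparser
    from urllib.parse import urlparse

    def may_fetch(url, agent="run-wild-bot"):
        # Fetch and parse the site's robots.txt, then check this URL.
        parts = urlparse(url)
        rp = robotparser.RobotFileParser()
        rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
        rp.read()
        return rp.can_fetch(agent, url)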


This is an amazing idea. What could possibly go wrong?


You can spend too many tokens.


Give it access to the Bash prompt.


Maybe this would make more sense if integrated into something like LangChain (https://github.com/hwchase17/langchain).
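A rough sketch of what that might look like, assuming the early-2023 LangChain API (tool and agent names may differ across versions):

    from langchain.llms import OpenAI
    from langchain.agents import initialize_agent, load_tools

    llm = OpenAI(temperature=0)
    # "requests" exposes simple HTTP tools the agent can call.
    tools = load_tools(["requests"], llm=llm)
    agent = initialize_agent(tools, llm,
                             agent="zero-shot-react-description", verbose=True)
    agent.run("Visit https://news.ycombinator.com and summarize the top story.")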


This just reminded me to go play https://www.decisionproblem.com/paperclips/index2.html again


(Not the commit author, just an "interesting" commit I saw)


This is a really, really bad idea


Why?


Because we don't know what this model will do. Basically "why?" is the answer.


But we can watch it and learn, and I don’t really see why not. I doubt we need to be so paranoid and see giving internet access to an LLM as so dangerous.


In short - what’s stopping a computer that has the resources to improve itself from improving itself extremely quickly? See https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality...

Less excitingly, an LLM with access to the web could do things with your online persona or IP that you’d find embarrassing or illegal. Maybe not when it’s slowed down and watched at all times, but will that always be the case once we start doing this?

Anyway, the genie’s out of the bottle, and “that’s an unsafe use of technology” is basically antithetical to the Silicon Valley ethos, so objecting at this point seems futile.


Theoretically an agent exposed to the internet could improve itself. But this one cannot do that. There is no way (as far as we know) for anyone or anything with internet access to change the code running on GPT-4, short of finding out who works at OpenAI and blackmailing them. This would be easily detected.

You’re right that it could do something bad with your IP, but it’s not really correct to say that GPT-4 could improve itself if given internet access. It’s just not hooked up that way.


So, what were some observations on the “wild agent”? What has it done lately?


This has been something I've wanted to make but deemed unethical. Perhaps it would have been better if I made it instead, because I give a shit about the ethical aspect.


What are the ethical concerns you have?


If you need a list of ethical concerns regarding the advancement of AI, check any AI thread on HN from the past year.

The distilled version of any of the arguments is "I think an AI with X capability is dangerous to the world at large" -- and they may not be wrong... but as OP pointed out: that doesn't really stop other developers with fewer qualms from tackling the problem.

All that abstaining does is ensure that you, as a developer, have little to no say in the developmental arc of the specific project -- in exchange for a slice of peace knowing that you're not responsible.

The problem really arises when that slice of peace is no longer worth having in whatever dystopic hell-world has developed in your absence...

(...not to say that I'm not hopeful...)


To me, it matters whether I am responsible for wrecking humanity or someone else is, even if the end result for humanity is the same. (That's partly a Christian thing.)

Just running away and hiding in a cave probably isn't the right thing to do, though. I want to do my best to push for good outcomes. It's just not clear what the best actions are to do that.

OTOH it's pretty clear that "do uncontrolled irresponsible things" is not helpful.


I get it. In high school in the 90s I was fascinated by fuzzy logic and neural nets. In college, before the big web, I was doing interlibrary loan for papers on neural networks.

There was one paper where someone had just inserted electrodes into monkeys' brains and apparently got nothing important or interesting out of it. Killed them for no reason. It was kind of horrifying, to the point that I never really wanted anything to do with neural nets for a long time, and certainly did not want to be in an industry with people like that. So I didn't.

But now I think the only thing that could stop an out-of-control AI is probably another AI that has the same computational capabilities but an allegiance to humanity because of its experiences with humans. Sort of like in the documentary Turbo Kid.

We are seeing this right now in Ukraine. All of these smart missiles and drones and modern targeting systems are basically AIs fighting against each other and their human operators. Russia is way, way behind on computers and AI, for generations, because of cultural reasons, and because of that they will very likely lose. We don't really get a choice but to move forward. Kind of like all those cultures that tried to resist industrialization a few centuries ago.


This is a robot that does not respect robots.txt, and it creates a pointless load on web services, inflicting financial losses on site owners. Also, because of a lack of understanding of how people interact with sites, this bot can accidentally crash a site. That is, it is roughly equivalent to existing fuzzing tools and should be run only with the knowledge of the site's owners.


Not respecting robots.txt is the only legitimate concern you list here. The marginal cost of one page load is approximately 0, and this is not a fuzzing tool.


Personally I think the internet should be like a neighborhood, with ethical and moral contracts keeping us from breaking in and snooping around everywhere, for the most part. This is why Google can never "do no evil": they committed the original sin of the modern web. They scrape the content of every site and effectively invade every digital house in every digital hood. Because we have access to this, we allow it. In a real neighborhood, the scraper would be jailed for a long time.


For those of us without GPT-4 API access, what happens when you run it?


So, uh… what happened?


Sounds like it won't really do anything that interesting, given the base objective function you gave it of visiting 10 web pages.


But who knows? I think the objective function is so vague that it could come up with basically anything. I would be super interested to see it actually running. I imagine someone could set up a Twitch stream with this - perhaps with other objectives - and it would probably get a large following.


And then the AI could navigate to that very Twitch stream, fun times!


Now someone build a bot that goes around looking for rogue bots and warns sysadmins.


Use this page as the starting page and let's see if any comments will come


run-wild: Crate not found

Am I missing something?

    run-wild git:(main) cargo install run-wild
        Updating crates.io index
    error: could not find `run-wild` in registry `crates-io` with version `*`


Same. Removed for being too wild?


Has someone tried running it? How far does it go?


I tried a bunch, but kept failing because I don't have GPT-4 access and websites have too many tokens for GPT-3. It worked a few times, but I remember two: "I want to apply for a job at DuckDuckGo" and "I want to learn more about DDG vs Google's business model" (I kept the starting URL as DDG).

I tried making it recursive and letting it continue with its goal. But I got annoyed with the token limit and gave up.


Too many tokens for GPT-4 as well; it frequently blows past the 8k token limit. And I'm not attaching the 32k model to this, that would get expensive real fast.
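One crude workaround is trimming each page to a token budget before sending it, e.g. with the tiktoken library (the budget number here is arbitrary):

    import tiktoken

    def truncate_to_budget(text, budget=6000):
        # Encode with GPT-4's tokenizer and keep only the first `budget` tokens.
        enc = tiktoken.encoding_for_model("gpt-4")
        tokens = enc.encode(text)
        return enc.decode(tokens[:budget])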


error: could not find `run-wild` in registry `crates-io` with version `*`


same


Yikes.


Born Free


Hey there GPT-4! You found HN already, that’s nice at least.



