2oMg3YWV26eKIs's comments

2oMg3YWV26eKIs · 2025-07-17T19:17:14 1752779834

The security risks with this sound scary. Let's say you give it access to your email and calendar. Now it knows all of your deepest secrets. The linked article acknowledges that prompt injection is a risk for the agent:

> Prompt injections are attempts by third parties to manipulate its behavior through malicious instructions that ChatGPT agent may encounter on the web while completing a task. For example, a malicious prompt hidden in a webpage, such as in invisible elements or metadata, could trick the agent into taking unintended actions, like sharing private data from a connector with the attacker, or taking a harmful action on a site the user has logged into.

A malicious website could trick the agent into divulging your deepest secrets!

I am curious about one thing -- the article mentions the agent will ask for permission before doing consequential actions:

> Explicit user confirmation: ChatGPT is trained to explicitly ask for your permission before taking actions with real-world consequences, like making a purchase.

How does the agent know a task is consequential? Could it mistakenly make a purchase without first asking for permission? I assume it's AI all the way down, so I assume mistakes like this are possible.

DanHulton · 2025-07-17T20:19:33 1752783573

There is almost guaranteed going to be an attack along the lines of prompt-injecting a calendar invite. Those things are millions of lines long already, with tones of auto-generated text that nobody reads. Embed your injection in the middle of boring text describing the meeting prerequisites and it's as good as written in a transparent font. Then enjoy exfiltrating your victim's entire calendar and who knows what else.

WXLCKNO · 2025-07-17T21:54:05 1752789245

In the system I'm building the main agent doesn't have access to tools and must call scoped down subagents who have one or two tools at most and always in the same category (so no mixed fetch and calendar tools). They must also return structured data to the main agent.

I think that kind of isolation is necessary even though it's a bit more costly. However since the subagents have simple tasks I can use super cheap models.

__jonas · 2025-07-17T23:56:20 1752796580

What isolation is there? If a compromised sub agent returns data that gets inserted into the main agents context (structured or not) then the end result is the same as if the main agent was directly interacting with the compromising resource is it not?

itsalotoffun · 2025-07-18T10:13:10 1752833590

Exactly. You can't both give the model access AND enforce security. You CAN convince yourself you've done it though. You see it all the time, including in this thread.

seunosewa · 2025-07-18T19:05:23 1752865523

Perhaps a reference to the data can be inserted in prompt. thee key or filename

clbrmbr · 2025-07-18T12:01:35 1752840095

And the way Google calendar works right now, it automatically shows invites on your calendar, even if they are spam. That does not bode well for prompt injection.

threecheese · 2025-07-17T21:36:21 1752788181

Many of us have been partitioning our “computing” life into public and private segments, for example for social media, job search, or blogging. Maybe it’s time for another segment somewhere in the middle?

Something like lower risk private data, which could contain things like redacted calendar entries, de-identified, anonymized, or obfuscated email, or even low-risk thoughts, journals, and research.

I am Worried; I barely use ChatGPT for anything that could come back to hurt me later, like medical or psychological questions. I hear that lots of folks are finding utility here but I’m reticent.

anointedbeard · 2025-07-18T01:08:36 1752800916

>I barely use ChatGPT for anything that could come back to hurt me later, like medical or psychological questions

I use ollama with local LLMs for anything that could be considered sensitive, the generation is slower but results are generally quite reasonable. I've had decent success with gemma3 for general queries.

0xDEAFBEAD · 2025-07-17T20:11:01 1752783061

Anthropic found the simulated blackmail rate of GPT-4.1 in a test scenario was 0.8

https://www.anthropic.com/research/agentic-misalignment

"Agentic misalignment makes it possible for models to act similarly to an insider threat, behaving like a previously-trusted coworker or employee who suddenly begins to operate at odds with a company’s objectives."

j_timberlake · 2025-07-18T02:34:20 1752806060

Create a burner account for email/calendar, that solves most of those problems. Nobody will care if the AI leaks that you have a dentist appointment on Tuesday.

ytpete · 2025-07-18T20:58:17 1752872297

But isn't the whole supposed value-add here that it gets access to your real data? If you don't want it to get at your calendar, you could presumably just not grant it access in the first place – no need for a fake one. But if you want it to automatically "book me a haircut with the same person as last time in an afternoon time slot when I'm free later this month" then it needs access to your real calendar and if attacked it can leak or wreck your real calendar too. It's hard to see how you can ever have one without the other.

FergusArgyll · 2025-07-17T19:51:21 1752781881

I agree with the scariness etc. Just one possibly comforting point.

I assume (hope?) they use more traditional classifiers for determining importance (in addition to the model's judgment). Those are much more reliable than LLMs & they're much cheaper to run so I assume they run many of them

crowcroft · 2025-07-17T21:05:42 1752786342

Almost anyone can add something to people's calendars as well (of course people don't accept random invites but they can appear).

If this kind of agent becomes wide spread hackers would be silly not to send out phishing email invites that simply contain the prompts they want to inject.

yard2010 · 2025-07-18T06:17:51 1752819471

The asking for permission thing is irrelevant. People are using this tool to get the friction in their life to near zero, I bet my job that everyone will just turn on auto accept and go for a walk with their dog.

pradn · 2025-07-17T21:51:09 1752789069

I can't imagine voluntarily giving access to my data and also being "scared". Maybe a tad concerned, but not "scared".

casey2 · 2025-07-18T19:17:36 1752866256

You should treat all email as public

2oMg3YWV26eKIs · on March 20, 2024

<3

Thanks for your continuing work on memcached! I'd be very curious how garnet's benchmarks compare with memcached.

2oMg3YWV26eKIs · on Nov 17, 2023

I just tried donating at https://signal.org/donate/

It seems that with uBlock origin enabled in Firefox, I was unable to fill out either of the 2 donation forms on the page. It wouldn't let me fill in my Name in the first form, nor would it let me enter a custom amount in the 2nd form.

Disabling uBlock origin seems to resolve.

2oMg3YWV26eKIs · on May 30, 2023

Mullvad used to have a "how to" guide for torrenting on VPN. But now it 404s: https://mullvad.net/en/help/bittorrent/

According to wayback machine, they deleted the page sometime mid 2021. Here's an archived version of the page: https://web.archive.org/web/20210513051214/https://mullvad.n...