One thing that sticks out to me is that there is an incorrect assumption from th...

skissane · 2025-05-02T08:35:12 1746174912

> People still don’t know how LLMs work and think they can be trained by interacting with them at the API level.

Unless they are logging the interactions via the API, and then training off those logs. They might assume doing so is relatively safe since all the users are trustworthy and unlikely to be deliberately injecting incorrect data. In which case, a leaked API key could be used to inject incorrect data into the logs, and if nobody notices that, there’s a chance that data gets sampled and used in training.

rcarmo · 2025-05-03T17:46:29 1746294389

Nobody really trains directly from logs without curation and filtering.

skissane · 2025-05-04T00:43:17 1746319397

Sure, but there is a non-zero risk that some malicious data could slip through the curation and filtering processes undetected

I agree that’s unlikely, but not astronomically unlikely

rcarmo · 2025-05-05T07:16:06 1746429366

Considering the costs involved in fine-tuning, nobody does it unless they are a very rich corporation. And certainly not for public-facing models…

drilbo · 2025-05-02T20:16:28 1746216988

unless I somehow skimmed over it, they only appear to refer to "prompt injection"