Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

One thing that sticks out to me is that there is an incorrect assumption from the journalists that having the API keys to an LLM can lead to injecting data.

People still don’t know how LLMs work and think they can be trained by interacting with them at the API level.



> People still don’t know how LLMs work and think they can be trained by interacting with them at the API level.

Unless they are logging the interactions via the API, and then training off those logs. They might assume doing so is relatively safe since all the users are trustworthy and unlikely to be deliberately injecting incorrect data. In which case, a leaked API key could be used to inject incorrect data into the logs, and if nobody notices that, there’s a chance that data gets sampled and used in training.


Nobody really trains directly from logs without curation and filtering.


Sure, but there is a non-zero risk that some malicious data could slip through the curation and filtering processes undetected

I agree that’s unlikely, but not astronomically unlikely


Considering the costs involved in fine-tuning, nobody does it unless they are a very rich corporation. And certainly not for public-facing models…


unless I somehow skimmed over it, they only appear to refer to "prompt injection"




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: