
Out of interest, are you worried that OpenAI would go against their API license terms and train on your data anyway, or are you worried that they might log your data and then have a security breach that exposes it to malicious attackers?



I think people simply worry that calling OpenAI on a lower-priced plan would cause their data to be scanned for training purposes.


Their API terms and conditions say they won't do that.

I'm fascinated by how little people trust them!


Terms and conditions only mean something if you have the money and patience to hold someone’s feet to the fire.

If I’m a CTO figuring out how to enable my team, I care a great deal about whether or not our private code is going to OpenAI.


I'm confident they don't want your code in their training data. The amount they have to lose if they're found to be using customer code as training data is enormous. Plus there are no guarantees that your code is good for training a model - model providers have been focusing much more heavily on quantity rather than quality of training data recently.

(Worrying that they may log your data and then have a security breach is a different matter - that's a reasonable concern, they've had security bugs in the past.)

I call this the AI trust crisis: people absolutely won't believe AI companies that say they won't train on their data. See https://simonwillison.net/2023/Dec/14/ai-trust-crisis/


Quality over quantity, rather?


Yes, that's what I meant! Too late to edit now.


This is a great read Simon.


All of the above. I’m not overly worried. But it’s surprising that they don’t mention it anywhere.



