Thanks for the link! I don't think that really addresses my concern, though.
My point is that these LLMs are basically enormous programs that defy analysis with our current tools. Sure, I can poke one a few times and see that it usually does what I want, but that's not the same as showing it never goes off the rails.
If it does something crazy like post my bank login online, even just once in a billion calls, that's still a failure rate orders of magnitude higher than I'm willing to accept.
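To put rough numbers on both halves of that, here's a quick back-of-envelope sketch. Every volume in it is a made-up assumption I picked for illustration; the only real math is the rule of three (zero failures in N trials only supports, at roughly 95% confidence, a failure rate below about 3/N) and simple multiplication:

```python
# Back-of-envelope numbers for the argument above. All volumes are
# made-up assumptions for illustration, not measurements of any system.

# Point 1: "poking it a few times" can't establish a rare-failure bound.
# By the rule of three, zero failures in N independent trials only
# supports (at ~95% confidence) a failure rate below about 3 / N.
trials_without_failure = 1_000          # assumed amount of manual testing
bound = 3 / trials_without_failure
print(f"0 failures in {trials_without_failure:,} trials only bounds the "
      f"failure rate below ~{bound:.0e} per call")

# Point 2: even a one-in-a-billion failure rate adds up at scale.
p_failure = 1e-9                        # hypothetical per-call failure rate
users = 10_000_000                      # assumed user base
calls_per_user_per_day = 50             # assumed usage
calls_per_year = users * calls_per_user_per_day * 365
expected_incidents = p_failure * calls_per_year
print(f"{calls_per_year:.2e} calls/year -> "
      f"~{expected_incidents:.0f} catastrophic incidents per year")
```

Even with those invented numbers, the gap between the failure rate testing can demonstrate and the failure rate I'd need is the whole problem.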
You’re basically asking me to prove to you that I can’t fly.
Put it this way: it is highly improbable that I can fly, and yet I can't come up with a way to prove that to you. There's an epistemic miscalculation somewhere if you insist on operating under the assumption that I might be able to fly.