99.99% seems off by orders of magnitude to me. I don't have an exact number, but I routinely see GPT-3.5 hallucinate, which is inconsistent with that level of reliability.
I've noticed this discussion tends to get too theoretical too quickly. I'm uninterested in perfection; 99.99% would be good enough, and 70% wouldn't. The actual number is something specific, knowable, and hopefully improving.
It's definitely not that good if we share a definition of poor data/prompts.
This afternoon I tried to use Codium to autocomplete some capnproto Rust code. Everything it generated was wrong. For example, it called member functions on non-existent structs instead of the correct free functions.
But I'll give it some credit: that's an obscure library in a less popular language.