I have seen numerous posts of LLM Q&A where, by the time people try to replicate them, GPT-4 is fixed. Either OpenAI is actively monitoring the Internet and patching these cases, or the Internet is actively conspiring to present falsified GPT-4 results to discredit OpenAI.
You aren't able to get access to the 'Open'AI dataset though, are you? Agreed, it would be an excellent addition for comparing source-available models, but it doesn't help with the accusations of foul play by OpenAI, nor with the existence of an anti-OpenAI conspiracy.
GPT-4 (at least) is explicit in saying that it learns from users' assessments of its answers, so yes, the only valid way to test is to give it a variation of the prompt and see how well it does. GPT-4 failed the "Sally" test for the first time after 8 tries, once I had changed every parameter. It got it right on the next try.
It's important to remember that GPT-4 is only deterministic at the batch level, because it is a mixture-of-experts model. Basically, every time you invoke it, your query could get routed to a different expert depending on what else is in the batch. At least, this is my understanding based on others' analyses.
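For intuition, here's a toy sketch of the hypothesized mechanism. Everything in it (top-1 routing, the capacity limit, the numbers) is an illustrative assumption, not GPT-4 internals; the point is just that with per-expert capacity limits, whether a token reaches its preferred expert can depend on the other tokens sharing its batch:

    import numpy as np

    N_EXPERTS, CAPACITY = 2, 2  # toy sizes; real MoE layers are far larger

    def route_top1(gate_logits):
        # Greedy top-1 routing with a per-expert capacity limit.
        # Tokens are assigned in batch order; once an expert is full,
        # later tokens that preferred it overflow (marked -1 here).
        load = np.zeros(N_EXPERTS, dtype=int)
        assigned = []
        for e in gate_logits.argmax(axis=-1):
            if load[e] < CAPACITY:
                assigned.append(int(e))
                load[e] += 1
            else:
                assigned.append(-1)  # dropped/rerouted -> different computation
        return assigned

    my_logits = [0.1, 0.9]  # my token prefers expert 1

    quiet_batch = np.array([[0.9, 0.1],  # neighbours prefer expert 0
                            [0.8, 0.2],
                            my_logits])
    busy_batch = np.array([[0.2, 0.8],   # neighbours also want expert 1
                           [0.3, 0.7],
                           my_logits])

    print(route_top1(quiet_batch)[-1])  # 1: my token reaches expert 1
    print(route_top1(busy_batch)[-1])   # -1: expert 1 is full, token overflows

Same token, same weights, different batch companions, different outcome. That's all "deterministic only at the batch level" would mean here.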
Do you have a source for this? I've also considered it, but I never saw any evidence that this is how GPT-4 is implemented.
I've always wondered how a system of multiple specialized small LLMs (with a "router LLM" in front of them all) would fare against GPT-4. Do you know if anyone is working on such a project?
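The plumbing for that pattern is pretty simple; the hard part is training the specialists. A minimal sketch of the dispatch layer might look like this (the specialist functions and the keyword router are hypothetical stand-ins; in the real design the router would itself be a small LLM prompted to emit one of the category labels):

    from typing import Callable, Dict

    # Stand-ins for specialist models; in practice each would wrap a
    # small fine-tuned LLM (one for code, one for math, one for chat, ...).
    def code_model(prompt: str) -> str:
        return f"[code specialist] {prompt}"

    def math_model(prompt: str) -> str:
        return f"[math specialist] {prompt}"

    def chat_model(prompt: str) -> str:
        return f"[general specialist] {prompt}"

    SPECIALISTS: Dict[str, Callable[[str], str]] = {
        "code": code_model,
        "math": math_model,
        "chat": chat_model,
    }

    def route(prompt: str) -> str:
        # Trivial keyword router, purely for illustration; the interesting
        # version replaces this with a cheap classifier LLM.
        lowered = prompt.lower()
        if any(k in lowered for k in ("def ", "compile", "bug", "python")):
            return "code"
        if any(k in lowered for k in ("integral", "solve", "equation")):
            return "math"
        return "chat"

    def answer(prompt: str) -> str:
        return SPECIALISTS[route(prompt)](prompt)

    print(answer("Why does my Python loop never terminate?"))  # -> code specialist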