A new 7B model, Snorkel-Mistral-PairRM-DPO, using a similar self-rewarding pipel...

spidersenses · on Jan 25, 2024

>Snorkel-Mistral-PairRM-DPO

The naming of these models is getting ridiculous...

column · on Jan 25, 2024

I kind of disagree. It's not "user friendly" but it is very descriptive. They are codenames afterall. Take "dolphin-2.6-mistral-7b-dpo-laser" for instance : with a little LLM background knowledge, just from the name you know it is a 7 billion parameters model based on Mistral, with a filtered dataset to remove alignment and bias (dolphin), version 2.6 and using the techniques described in the Direct Preference Optimization (https://arxiv.org/pdf/2305.18290.pdf) and Laser (https://arxiv.org/pdf/2312.13558.pdf) papers to improve its output.

salad-tycoon · on Jan 25, 2024

And here I was thinking they were somehow using the first three words from my Bitcoin wallet.

spidersenses · on Jan 25, 2024

Thank you for a great and informative explanation despite my somewhat ignorant take.

I'm an occasional visitor to huggingface, so I'm actually superficially familiar with the taxonomy. I just felt like, even if I tried to satirize it, I wouldn't be able to come up with a crazier name. And that's not even the end of the Cambrian explosion of LLMs.

Loic · on Jan 25, 2024

A bit like the User-Agent string.

mark_l_watson · on Jan 25, 2024

Thanks for this, and the link you provided below for GGUF files! I just cleared my schedule this afternoon to kick the tires.

azinman2 · on Jan 25, 2024

I assume this doesn’t yet run on llama.cpp?

lhl · on Jan 25, 2024

here are some GGUFs https://huggingface.co/brittlewis12/Snorkel-Mistral-PairRM-D...

tarruda · on Jan 25, 2024

It is based on Mistral which llama.cpp supports, so I assume it does run (you might need to convert to GGUF format and quantize it).