I don't really get who these are for - do people use them in their projects? I d...

vatican_banker · on July 10, 2024

> Trained routers are provided out of the box, which we have shown to reduce costs by up to 85%

The answer is here. This is a cost-saving tool.

All companies and their moms want to be in the GenAI game but have strict budgets. Tools like this help to keep GenAI projects within budget.

rodrigobahiense · on July 10, 2024

For the company I work for, one of the most important aspects is ensuring we can fallback to different models in case of content filtering since they are not equally sensitive/restrict.

veb · on July 10, 2024

From what I understand, it's from people using it in their workflows - say, Claude but keep hitting the rate limits, so they have to wait until Claude says "you got 10 messages left until 9pm", so when they hit that, or before they switch to (maybe) ChatGPT manually.

With the router thingy, it keeps a record, so you know every query where you stand, and can switch to another model automatically instead of interrupting workflow?

I may be explaining this very badly, but I think that's one use-case for how these LLM Routers help.

Kiro · on July 10, 2024

I don't think that's a use case since you don't get rate limited when using the API.

Onawa · on July 10, 2024

We get rate limited when using Azure's OpenAI API. As a gov contractor working with AI, I have limited means for getting access to frontier LLMs. So routing tools that can fail over to another model can be useful.

fkyoureadthedoc · on July 10, 2024

Same. Initially we just load balanced between various regions, ultimately bought some PTUs.

kordlessagain · on July 10, 2024

Anthropic Build Tier 4: 4,000 RPM, 400,000 TPM, 50,000,000 TPD for Claude 3.5 Sonnet

PiRho3141 · on July 10, 2024

This is for applications that use LLMs or Chat GPT via API.

brandall10 · on July 10, 2024

You may have a variety of model types/sizes, fine tunes, etc, that serve different purposes - optimizing for cost/speed/specificity of task. At least that's the general theory with routing. This one only seems to optimize for cost/quality.

killerstorm · on July 11, 2024

Yeah. I can't even consistently get JSON output out of all models. What are people doing that they don't care about output format?...

monarchwadia · on July 10, 2024

I think a lot of people are just interested in hitting the LLM without any bells or whistles, from Typescript. A low level connector lib would come in handy, yeah? https://github.com/monarchwadia/ragged