Thanks for bringing llmprices.dev to my attention. I also have a comparison page for models hosted on OpenRouter (https://minthemiddle.github.io/openrouter-model-comparison/). I do comparisons via regex (so "claude-3-haiku(?!:beta)|flash" will show you haiku and flash, but not haiku:beta).
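To illustrate how that negative-lookahead filter behaves, here's a minimal Python sketch (the model names in the list are just made-up examples):

```python
import re

# The filter pattern from the comparison page: matches "claude-3-haiku"
# only when NOT immediately followed by ":beta", or any name containing "flash".
pattern = re.compile(r"claude-3-haiku(?!:beta)|flash")

models = ["claude-3-haiku", "claude-3-haiku:beta", "gemini-flash-1.5"]
matched = [m for m in models if pattern.search(m)]
print(matched)  # ['claude-3-haiku', 'gemini-flash-1.5']
```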
I wish that OpenRouter would also expose the number of output tokens via the API, as this is also an important criterion.
Yeah, we want to do exactly this: benchmark and add more data from different GPUs/cloud providers. We'd appreciate your help a lot!
There are many inference engines that can be tested and updated to find the best inference methods.
Good luck, companies would love that. Don't get discouraged. Unlike with my tool, I think you should charge; that might keep you motivated to keep doing the work.
It's a lot of work. Your target users are companies that use Runpod and AWS/GCP/Azure, not Fireworks and Together; the latter are in the business of selling tokens, while you are selling the cost of GPU-seconds.
This is especially true if you are deploying custom or fine-tuned models. In fact, for my company I also ran benchmark tests where we measured cold starts, performance consistency, scalability, and cost-effectiveness for models like Llama 2 7B and Stable Diffusion across different providers: https://www.inferless.com/learn/the-state-of-serverless-gpus... It can save months of evaluation time. Do give it a read.
I doubt we would have clicked on it if they called it a Modern Alternative to Yahoo! Directory. You are correct though.
Can you allow people to add sites they've found? For example, adding something like Hydra or Citus would require you to already know about them.
This is perfect though, because I'm always finding fascinating tools online that I either favorite on HN or star on GitHub, but not all the time. There's also a Rust serverless company whose name I can't recall, which fits your use case nicely.
Last year Mistral watched as every provider hosted their models with little to no value capture.
Nemo is Apache 2.0 licensed; they could have easily made it a Mistral Research License model.
It's hard to pitch VCs for more money to build more models when you don't capture anything by making it Apache 2.0.
Not everyone can be Meta.
Magnet links are cute, but honestly, most people would rather use HF to get their models.