
A lot of talk about how much cheaper it is than all other models.

It remains to be seen what the pricing will be when run by non-DeepSeek providers. They might be loss-leading.

The comparison for cheap models should also include Gemini 2.0 Flash Exp. I could see it being even cheaper when it stops being free, if it ever does. There's definitely a scenario where Google just keeps it free-ish for a long time with relatively high limits.



Per available providers on OpenRouter right now:

DeepSeek: $0.14 per million input tokens, $0.28 per million output tokens (66 tokens/s)

Fireworks: $0.90 per million input tokens, $0.90 per million output tokens (23 tokens/s)

DeepInfra: $1.00 per million input tokens, $2.00 per million output tokens (1.27 tokens/s)

Compared to Llama 3.1 405B (a smaller model than this, afaik):

The cheapest is $0.80/$0.80 at 24 t/s, all the way up to $4.00/$4.00 at 8 t/s.

So third-party costs seem similar, but there aren't many people hosting DeepSeek right now.
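
For what that implies per request, here's a minimal sketch of the arithmetic at the rates quoted above; the 4k-in / 1k-out token counts are made-up example values, not real workload data:

    # Hypothetical per-request cost at the OpenRouter rates quoted above.
    # Values are ($ per 1M input tokens, $ per 1M output tokens).
    PRICES = {
        "DeepSeek":  (0.14, 0.28),
        "Fireworks": (0.90, 0.90),
        "DeepInfra": (1.00, 2.00),
    }

    def cost_usd(tokens_in, tokens_out, provider):
        rate_in, rate_out = PRICES[provider]
        return tokens_in / 1e6 * rate_in + tokens_out / 1e6 * rate_out

    # Example: a 4k-token prompt producing a 1k-token completion.
    for name in PRICES:
        print(f"{name}: ${cost_usd(4_000, 1_000, name):.6f}")
    # DeepSeek: $0.000840, Fireworks: $0.004500, DeepInfra: $0.006000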


Just a minor clarification: DeepSeek's pricing for this model is temporary, to match their previous model. They announced [1] that it will be the following after February 8:

DeepSeek: $0.27 per million input tokens, $1.10 per million output tokens (66 tokens/s)

Still much cheaper than the others for input pricing, though.
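
To put the change in numbers, a quick sketch with the same hypothetical 4k-in / 1k-out request as above:

    # DeepSeek rates before vs. after February 8, per the announcement in [1].
    OLD = (0.14, 0.28)  # ($/1M input, $/1M output)
    NEW = (0.27, 1.10)

    def cost(tokens_in, tokens_out, rates):
        return tokens_in / 1e6 * rates[0] + tokens_out / 1e6 * rates[1]

    before = cost(4_000, 1_000, OLD)  # $0.000840
    after = cost(4_000, 1_000, NEW)   # $0.002180
    print(f"{after / before:.2f}x")   # ~2.60x for this input/output mix

So roughly 2x on input and ~4x on output, but still well under the third-party rates listed above.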

[1] https://api-docs.deepseek.com/news/news1226#-api-pricing-upd...


This is at a peasant 8k context size, too.


Yeah, I was assuming they're selling it cheap to get people to try the model.

But it's still certainly cheaper than everyone else at the moment.


Given that we have open weights for it, the costs to run it relative to other open-source models are fairly transparent.


True, but the cost for other models will continue to go down as well.


Exactly. Gemini 2.0 Flash ranks better on quality, is faster, and is cheaper if you assume the same pricing as 1.5 (which might go down).

These models are being commoditized.

https://artificialanalysis.ai/models/deepseek-v3


For what it's worth, as always, 99% of benchmarks are very unreliable, and per-task performance still differs greatly between models, with plenty of cases where results are wildly different.

I have a task I use in my work where Gemini 1.5 Pro is SOTA, handily beating o1, Sonnet-3.5, Gemini-exp, and everyone else, very consistently and significantly.

The newer/bigger models are better at reasoning and especially coding, but there are plenty of tasks that have little overlap with those skills.



