Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

My current dream is a model that's good at coding with a ~10m token content window. I understand Llama 4 has a window approximately that size, but I'm hearing mixed results on its coding capacity.

If it had deep research and this, with a large number of API requests, I'd consider $200/month.



Has anyone found the output at these large context windows usable at all?

IME the quality of all models goes down considerably after just a few thousand tokens. The chances of hallucinating, mixing up prompts, forgetting previous prompts, etc., are much more likely as context size increases. I couldn't imagine a context of 1M tokens, let alone 10M, being usable at all. Not to mention that any query is going to come to a crawl just to move that amount of data around (which still annoyingly happens on every query...).

So usually at around 10K tokens I ask it to summarize what was discussed, or I manually trim down the current state, and start a new fresh chat from there. I've found this to work much better than wasting my time fighting bad output. This is also cheaper if you're on a metered plan (OpenRouter, etc.).


The results are not mixed, Llama 4 is terrible at coding. I agree on longer context window being the dream.


I mean you get a 2 Million token context window and by far my favorite coding model with Gemini 2.5 Pro.


I just subscribed to the free trial yesterday, and I've been hooked tbh. I haven't subscribed to any of the other LLM companies so far. I hope something else comes out within a month because I really don't want to spend 22 Euro per month for it.

The 1M context window (2M?) really sets it apart.


I believe you can still use Gemini 2.5 Pro for free via https://aistudio.google.com and their gemini-2.5-pro-exp-03-25 model ID through their API.

The free tier is "used to improve our products", the paid tier is not.


22 euro per month is less than 1 per day. Less than one espresso.

I get the subscription fatigue, but there are splurges and there are truly valuable things.


Has someone tried the 2m context window for a code base and can report how it compares over claude or o1?


Made a video comparing Gemini 2.5 Pro to Claude Sonnet 3.7 recently: https://www.youtube.com/watch?v=AVdVJ_hD_vo


I mean I've tried it with Gemini 2.5 Pro + Roo and then tried Claude 3.7 + Roo on the same task and Gemini blew Claude away. Haven't spent anymore OpenRouter credits, because Gemini was so much better.


Does Gemini have a web interface similar to claude.ai? I am lazy[1], but I am also poor. I would not be able to afford 100 USD per month.

[1] But if it is cheap enough, has large context window, then I might consider setting up something akin to claude.ai with Gemini's API.


Yeah AI Studio is free with decent rate limits, though obviously more developer focused: https://aistudio.google.com/

The official Gemini app works well for me too and there's a nice free tier and it's free if you have a newer Pixel phone. Otherwise $20/month for the Advanced tier. There's no $200/month option.

https://gemini.google.com/app


There's also Google's https://idx.dev - which is a webide/vscode dealio and you can use gemini in agentic mode (mix of 2.0/2.5 but if you use your own gemini key you can guarantee 2.5 Pro i think)

edit, well it now appears to be https://firebase.studio/ - that is a recent change I haven't used it since it changed its name..


I mostly use LLMs on PC, as I use LLMs mainly for coding.

Does AI Studio allow you to have projects with project files and whatnot?

How about its context window length, more or less than Claude's?

I am also interested in open-source alternatives to the web interface that claude.ai has, I know there are some but I have forgotten their names, would be cool to have a list here.


The best open source UI I know of is https://openwebui.com/ - you can point it at any OpenAI API compatible endpoint and both Gemini and Anthropic offer those now.

You can use the Gemini API for free with quite generous allowances, including for 2.5 Pro.


Thanks Simon, will take a look.

Extremely off-topic: are you still around DS?


DS?


DarkScience's IRC server.


Wow that takes me back! I've not been active on IRC in about a decade I'm afraid.


So we have talked a decade ago?! Damn! I remember you from DS. :D


AI studio is only developer focused if you’re not working on AI, which is a prohibited use case according to the Gemini API / AI Studio “Additional Terms”




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: