Hacker News | aray07's comments

Yeah, this is a bit crazy, but not surprising at all.

The limits have always been opaque and you never know when they change.

I started building an open-source local proxy that logs every rate-limit header Claude Code sends.

I'm using it to track usage and get a better sense of the 5h and 7d (weekly) limits.

Some initial data from 11 observed 5h sessions on Max 20x:

- 5h budget: roughly $120–$280 per window
- 7d budget: roughly $1,300–$1,900
- Separate Sonnet-only 7d budget at ~$150
- 95% of tokens are cache reads. They barely move the meter.

It’s open source so more people can run it and we can figure out the real numbers.

https://github.com/abhishekray07/claude-meter
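The repo's internals aren't shown here, but the core idea of such a proxy can be sketched in a few lines. This is a minimal illustration, assuming Anthropic-style `anthropic-ratelimit-*` response headers and a hypothetical upstream URL; it ignores hop-by-hop header handling and streaming that a real proxy would need.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

UPSTREAM = "https://api.anthropic.com"  # hypothetical upstream


def ratelimit_headers(headers: dict) -> dict:
    """Keep only the rate-limit headers from an upstream response."""
    return {k.lower(): v for k, v in headers.items()
            if k.lower().startswith("anthropic-ratelimit-")}


class LoggingProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        # Forward the request body upstream (sketch: no streaming, no retries).
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        req = urllib.request.Request(UPSTREAM + self.path, data=body,
                                     headers=dict(self.headers), method="POST")
        with urllib.request.urlopen(req) as resp:
            # Log every rate-limit header the response carries.
            print(ratelimit_headers(dict(resp.getheaders())))
            self.send_response(resp.status)
            for k, v in resp.getheaders():
                self.send_header(k, v)
            self.end_headers()
            self.wfile.write(resp.read())


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), LoggingProxy).serve_forever()
```

Pointing Claude Code at `http://127.0.0.1:8080` instead of the API would then surface the limit counters on every request.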


Subscription business models have become a new art form thanks to Anthropic

You can choose your own model in Claude Code, and it generally defaults to Opus.

Yeah, but that's just the model the main agent uses. The subagents aren't Opus; they are Haiku and Sonnet. Most of the token-heavy work is offloaded to subagents because of this.

Yeah - it's definitely a new way of working, and one that takes getting used to!

The dropping-requirements problem is real. What's helped us is breaking the spec into numbered ACs and having the verification run per-criterion. If AC-3 fails, you know exactly what got dropped.
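The per-criterion idea can be sketched roughly like this (the spec format, AC ids, and `verify` helper are all hypothetical, not from any particular tool): number each acceptance criterion, check each one independently, and report exactly which ids failed.

```python
import re

# Hypothetical spec: one numbered acceptance criterion per line.
SPEC = """\
AC-1: endpoint returns 200 for a valid id
AC-2: response includes a created_at timestamp
AC-3: unknown ids return 404, not 500
"""


def parse_criteria(spec: str) -> dict:
    """Map AC ids (AC-1, AC-2, ...) to their text."""
    return dict(re.findall(r"^(AC-\d+):\s*(.+)$", spec, re.MULTILINE))


def verify(results: dict, spec: str) -> list:
    """Return ids of criteria that failed or were never checked at all."""
    return [ac for ac in parse_criteria(spec) if not results.get(ac, False)]


# One result per criterion; AC-3 was silently dropped by the agent.
failed = verify({"AC-1": True, "AC-2": True}, SPEC)
print(failed)  # ['AC-3']
```

Treating "never checked" the same as "failed" is what surfaces dropped requirements rather than letting them pass silently.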


I'll try that out, thanks for the tip!


I do it per feature, not per step. Write the AC for the whole feature upfront, then the agent builds against it. I haven't added a spec-validation step before coding, but that's a good idea. Catching ambiguity in the spec before the agent runs with it would save a lot of rework.


Agreed. The spec file is context. Writing acceptance criteria before you prompt provides the context the agent needs to not go off in the wrong direction. Human leverage just moved up and the plan/spec is the most important step.

Parallelism on top of bad context just gets you more wrong answers faster.


This is great. The tests in this case are the spec. When you give the agent something concrete to fail against, it knows what done looks like.

The problem is if you skip that step and ask Claude to write the tests after.
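A toy illustration of tests-as-spec (all names here are hypothetical): the concrete test is written first, and the implementation is "done" exactly when it passes.

```python
# The spec is a concrete test, written before slugify() exists:
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Extra  Spaces  ") == "extra-spaces"


# A minimal implementation the agent might produce to satisfy it:
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


test_slugify()  # passes: done, by the spec's own definition
```

Written after the fact instead, the test tends to just restate whatever the implementation already does, which is exactly the failure mode described above.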


I think the friction has moved upstream - now it's working on the right thing and specifying what correct looks like. I don't think we are going back to a world where we write code by hand again.


Unless what you want to do isn't well represented in the training set.


Yup, agree - I spend most of my time reviewing the spec. The highest-leverage time is now deciding what to work on and then working on the spec. I ended up building the verify skill (https://github.com/opslane/verify) because I wanted to ensure Claude follows the spec. I have found that even after you have the spec, it can sometimes not follow it, and it takes a lot of human review to catch those issues.


Test theatre is exactly the right framing. The tests are syntactically correct, they run, they pass, but do they actually prove anything?
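The distinction is easy to show with a toy example (the `add` function and test names are hypothetical): a theatrical test exercises the code without constraining it, while a real test pins behaviour down.

```python
def add(a, b):  # hypothetical function under test
    return a + b


# Test theatre: syntactically valid, runs, passes -
# but any non-None return value would satisfy it.
def test_add_theatre():
    assert add(2, 2) is not None


# A test that actually proves something about the behaviour:
def test_add_real():
    assert add(2, 2) == 4
    assert add(-1, 1) == 0


test_add_theatre()
test_add_real()
```

Both tests pass here, but only the second would fail if `add` quietly returned the wrong number, which is the whole point of having it.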


