
No, because they're already taking that into account.

>Metric: generation throughput (token/s) = number of the generated tokens / (time for processing prompts + time for generation).

(Though they're batching, so this is an unfair comparison. It would be interesting to see the throughput at batch size 1.)
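
To make the quoted metric concrete, here is a rough sketch of the arithmetic. The function name and every number below are hypothetical, chosen only for illustration; they are not measurements from the project being discussed.

```python
def generation_throughput(generated_tokens: int,
                          prefill_seconds: float,
                          decode_seconds: float) -> float:
    """Tokens per second, with prompt-processing time counted in the denominator."""
    total_seconds = prefill_seconds + decode_seconds
    if total_seconds <= 0:
        raise ValueError("total time must be positive")
    return generated_tokens / total_seconds


# Hypothetical run: 32 sequences in one batch, 128 generated tokens each.
batch_size = 32
tokens_per_sequence = 128
prefill_s = 20.0   # assumed time spent processing prompts
decode_s = 108.0   # assumed time spent generating

# Aggregate throughput over the whole batch, as the metric defines it.
aggregate = generation_throughput(batch_size * tokens_per_sequence,
                                  prefill_s, decode_s)

# Rate seen by any one sequence inside the batch.
per_sequence = aggregate / batch_size

print(f"aggregate: {aggregate:.1f} tok/s, per sequence: {per_sequence:.1f} tok/s")
```

With these made-up numbers the batch as a whole reports 32 tok/s while each individual sequence only advances at 1 tok/s, which is why a batched figure isn't directly comparable to an unbatched one. Note that dividing by the batch size is not the same as the real batch-size-1 speed; that would have to be measured separately.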



