No, because they're already taking that into account.

>Metric: generation throughput (token/s) = number of the generated tokens / (time for processing prompts + time for generation).

(Though they're batching, so the comparison is unfair. It would be interesting to see the speed at batch size 1.)
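To make the quoted metric concrete, here's a rough sketch of the arithmetic. The function name and all the numbers are made up for illustration; none of them come from the benchmark itself.

```python
def generation_throughput(generated_tokens: int,
                          prefill_seconds: float,
                          decode_seconds: float) -> float:
    """Tokens/s with prompt-processing time included in the denominator."""
    total_seconds = prefill_seconds + decode_seconds
    if total_seconds <= 0:
        raise ValueError("total time must be positive")
    return generated_tokens / total_seconds


# Hypothetical run: 32 sequences in one batch, 128 generated tokens each.
batch_size = 32
tokens_per_sequence = 128

# Aggregate throughput across the whole batch (what the metric reports).
aggregate = generation_throughput(
    batch_size * tokens_per_sequence,
    prefill_seconds=10.0,
    decode_seconds=90.0,
)

# Rate seen by any one sequence inside that batch.
per_sequence = aggregate / batch_size

print(f"aggregate: {aggregate:.2f} tok/s, per sequence: {per_sequence:.2f} tok/s")
```

Note that dividing by the batch size only gives the per-sequence rate within the batch. It isn't the same as an actual batch-size-1 run, which would have to be measured separately.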