>Metric: generation throughput (token/s) = number of the generated tokens / (time for processing prompts + time for generation).
(Though they're batching requests, so this is an unfair comparison. It would be interesting to see the speed at batch size 1.)
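To make the batching caveat concrete, here is a rough sketch of the quoted metric in Python. The function names and the numbers are made up for illustration and are not taken from any benchmark; the per-sequence figure is just the aggregate divided by batch size, which is only a crude stand-in for a real batch-size-1 measurement.

```python
def generation_throughput(num_generated_tokens: int,
                          prompt_time_s: float,
                          generation_time_s: float) -> float:
    """Quoted metric: generated tokens / (prompt time + generation time)."""
    total_time_s = prompt_time_s + generation_time_s
    if total_time_s <= 0:
        raise ValueError("total time must be positive")
    return num_generated_tokens / total_time_s


def per_sequence_throughput(aggregate_tokens_per_s: float, batch_size: int) -> float:
    """Crude per-sequence rate: aggregate throughput split evenly across the batch."""
    if batch_size <= 0:
        raise ValueError("batch size must be positive")
    return aggregate_tokens_per_s / batch_size


# Hypothetical run: 32 sequences, 32 tokens each, so 1024 generated tokens in total.
aggregate = generation_throughput(1024, prompt_time_s=2.0, generation_time_s=6.0)
single = per_sequence_throughput(aggregate, batch_size=32)
print(f"aggregate: {aggregate} token/s, per sequence: {single} token/s")
```

With these made-up numbers the aggregate is 128 token/s, but each individual sequence only advances at 4 token/s, which is why a batched figure isn't directly comparable to a single-stream one.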