mcharytoniuk
8 months ago
on:
Show HN: Open-source load balancer for llama.cpp
Yes, exactly. You can split the available context into "slots" (chunks) so that it can handle multiple requests concurrently. The number of slots is configurable.
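
To make the slot idea concrete, here is a rough toy model in Python. It is only a sketch, not llama.cpp's or the load balancer's actual code: `SlotPool`, `tokens_per_slot`, `acquire`, and `release` are hypothetical names, and the even split of the context across slots is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SlotPool:
    """Toy model: one context window divided into a fixed number of slots.

    Hypothetical illustration only; not taken from llama.cpp.
    """
    total_context: int  # total context size, in tokens
    num_slots: int      # configurable number of slots
    busy: Set[int] = field(default_factory=set)

    @property
    def tokens_per_slot(self) -> int:
        # Assumption: the context is split evenly across slots.
        return self.total_context // self.num_slots

    def acquire(self) -> Optional[int]:
        """Return the lowest free slot id, or None if every slot is busy."""
        for slot_id in range(self.num_slots):
            if slot_id not in self.busy:
                self.busy.add(slot_id)
                return slot_id
        return None

    def release(self, slot_id: int) -> None:
        """Mark a slot as free so another request can use it."""
        self.busy.discard(slot_id)
```

With `SlotPool(total_context=8192, num_slots=4)`, each request would get up to 2048 tokens of context, and up to four requests could be in flight at once. A fifth would have to wait (or be routed elsewhere) until a slot is released. More slots means more concurrency but less context per request.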