Yes, exactly. You can split the available context into "slots" (chunks) so the server can handle multiple requests concurrently. The number of slots is configurable.
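For llama.cpp's bundled server, the slot count is set with `-np`/`--parallel`, and the total context given by `-c` is divided evenly across the slots:

```shell
# Total context of 8192 tokens split into 4 slots,
# so each concurrent request gets a 2048-token window.
llama-server -m ./model.gguf -c 8192 -np 4
```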
Currently it runs as a single in-memory instance, so there is no state to transfer. HA is on the roadmap; only then will it need some kind of distributed state store.
Agents installed alongside llama.cpp report local state to the load balancer, so instances can be added and removed dynamically; no central configuration is needed.
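A minimal sketch of what such an agent report could look like; the field names and `SlotReport` type are hypothetical, not the actual wire format:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SlotReport:
    """Hypothetical heartbeat an agent sends to the load balancer."""
    instance_id: str
    total_slots: int
    free_slots: int

def heartbeat_payload(report: SlotReport) -> str:
    # The balancer adds or drops an instance based on whether its
    # heartbeats keep arriving, so no central config file is needed.
    return json.dumps(asdict(report))

print(heartbeat_payload(SlotReport("node-1", total_slots=4, free_slots=3)))
```

Because discovery is push-based, scaling out is just starting another llama.cpp instance with its agent; the balancer picks it up on the first heartbeat.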