Nitpick, because the thrust of your argument is correct: in a self-hosted system you don't necessarily need a single machine to encode 10 video feeds. Each client (assuming this is a P2P approach) only needs to encode one, the user's. It does need to transcode that into 9 outbound streams, though, which is asking quite a lot of even current high-end laptop hardware and is basically impossible on a smartphone.
Add in the unrealistic upstream bandwidth needs and, yeah, your point still stands.
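
To put rough numbers on the upstream side, here is a back-of-the-envelope sketch. It assumes a full mesh where each client sends a separate stream to every other participant. The 1.5 Mbps per-stream bitrate and the function name are illustrative assumptions of mine, not measured figures.

```python
def mesh_upstream_mbps(participants: int, per_stream_mbps: float) -> float:
    """Upstream bandwidth one client needs in a full-mesh call.

    Each client sends one stream to every other participant,
    so the cost scales with (participants - 1).
    """
    if participants < 2:
        return 0.0  # nobody to send to
    return (participants - 1) * per_stream_mbps


# Assumed bitrate: 1.5 Mbps per outbound video stream (illustrative only).
print(mesh_upstream_mbps(10, 1.5))  # 9 outbound streams -> 13.5
```

With those assumed numbers, a 10-person call needs about 13.5 Mbps of sustained upload per client, more than many home connections offer.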