Does this also mean it would be possible to train across a parallel setup of GPU-poor machines, instead of needing lots of GPU memory/bandwidth on a single computer?