The speed of light is always going to be a limitation for these kinds of applications. I don't see how the latency can be reduced enough to make this feel like a live jam session. Taking your 30 ms threshold, light in a vacuum covers only about 9,000 kilometers in that time, and signals in optical fiber travel roughly a third slower, before you add any routing or audio buffering delay. Could we perhaps use some of the predictive techniques that games like Counter-Strike use?
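To put rough numbers on that limit, here is a back-of-the-envelope sketch. The 30 ms budget comes from the thread; the fiber refractive index of 1.47 is my assumption for typical silica fiber, and the function name is just illustrative. It ignores routing, buffering, and audio interface delay, so real distances would be shorter.

```python
# Speed of light in a vacuum, in km/s (exact by definition of the metre).
C_VACUUM_KM_PER_S = 299_792.458

# Assumption: typical refractive index of silica optical fiber.
FIBER_REFRACTIVE_INDEX = 1.47


def one_way_distance_km(latency_ms, refractive_index=1.0):
    """Distance a signal covers in latency_ms through a medium with the given index."""
    speed_km_per_s = C_VACUUM_KM_PER_S / refractive_index
    return speed_km_per_s * latency_ms / 1000.0


vacuum_km = one_way_distance_km(30)
fiber_km = one_way_distance_km(30, FIBER_REFRACTIVE_INDEX)

print(f"vacuum: {vacuum_km:.0f} km, fiber: {fiber_km:.0f} km")
```

If the 30 ms has to cover a round trip (you play, the other person reacts, you hear it), halve both figures.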
You could maybe use that to smooth out envelopes, but whether a note is played or not isn't something you could possibly predict. I mean, you could predict the most likely note to be played next based on some musical analysis, but there's no way you could get it accurate enough to not compromise the surprises that make music so interesting.
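For the envelope part, a minimal sketch of what I mean, assuming the remote side sends timestamped level samples in the range 0 to 1 (the function name and sample format are made up for illustration, and this is plain linear extrapolation, not what any particular game actually does):

```python
def extrapolate_level(prev_t, prev_v, last_t, last_v, now_t):
    """Guess the current envelope level by extending the line through the last two samples."""
    slope = (last_v - prev_v) / (last_t - prev_t)
    predicted = last_v + slope * (now_t - last_t)
    # Clamp so a steep slope cannot push the level outside the valid range.
    return min(1.0, max(0.0, predicted))


# Samples arrived for t=0.00 s and t=0.01 s; we need a value for t=0.04 s.
predicted_level = extrapolate_level(0.00, 0.50, 0.01, 0.55, 0.04)
print(predicted_level)
```

This works tolerably for something continuous like a swell or a fade, but a note onset is a discrete event, so there is no slope to extend.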
If we could get enough bits to work with quantum entanglement, then sure. But regardless, to get information out of an entangled pair you have to send the measurement basis over an ordinary classical channel before you can decode the data, so you are still bound by the speed of light.