It did talk about being adaptive. So if you are just listening to music, it should be able to use large buffers. However, if you switch to something with low-latency demands, it can start using smaller buffers.
My main concern is: without rewriting, how can you handle pressing play or pause? Sure, that music isn't realtime and can use large buffers, but if I start playing something else, or stop the music, I still want it to be responsive, which may require remixing.
Except a typical desktop system is usually a mix of low-latency and high-latency audio streams. You're playing music, and you're typing on a 'clacky' virtual keyboard. The user doesn't want 100ms of lag between each finger tap and the audible feedback. Yet when no typing is happening, the CPU shouldn't have to wake up 10x per second just to fill audio buffers.
The solution is to fill a five-minute buffer with five minutes of your MP3 and send the CPU to sleep; then, if the user taps the keyboard, rewind that buffer, mix in the 'clack' sound effect, and continue.
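A minimal sketch of that rewind-and-remix idea, in Python. Everything here is illustrative, not a real audio API: the mixer keeps a large pre-rendered buffer, the hardware consumes samples from it, and when a low-latency event (a key 'clack') arrives, the mixer writes the effect into the not-yet-played portion of the buffer instead of shrinking the buffer size. The `safety_margin` accounts for samples the hardware may fetch before the mix finishes.

```python
SAMPLE_RATE = 48_000

class RewindMixer:
    """Hypothetical mixer holding a large pre-filled audio buffer."""

    def __init__(self, music_samples):
        # Large buffer already filled with decoded music (e.g. minutes of MP3).
        self.buffer = list(music_samples)
        self.play_pos = 0  # index of the next sample the hardware will fetch

    def advance(self, n):
        # Simulate the DMA/hardware consuming n samples while the CPU sleeps.
        self.play_pos = min(self.play_pos + n, len(self.buffer))

    def mix_effect(self, effect_samples, safety_margin=64):
        # "Rewind" to just ahead of the playback position and mix the effect
        # into samples that have not been consumed yet.
        start = self.play_pos + safety_margin
        for i, s in enumerate(effect_samples):
            j = start + i
            if j >= len(self.buffer):
                break
            self.buffer[j] += s  # additive mix; a real mixer would also clamp

# Usage: one second of silence stands in for music; a keypress arrives mid-playback.
mixer = RewindMixer([0.0] * SAMPLE_RATE)
mixer.advance(10_000)                  # hardware has already played 10 000 samples
clack = [0.5, -0.5, 0.25, -0.25]       # tiny stand-in for the clack sample
mixer.mix_effect(clack)
print(mixer.buffer[10_064:10_068])     # the effect now sits just ahead of play_pos
```

The point of the sketch is that the expensive work (decoding five minutes of audio) stays batched, while responsiveness only costs a small write into memory the hardware hasn't reached yet.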