When I was 12 or 13 I wrote a program that transmitted data over speaker and microphone. For each byte it would generate a 90ms tone followed by 10ms of silence.
It wasn't elegant, and most of the code was stolen from Planet Source Code, but it worked mostly.
For the most part it was like a less dynamic modem: less EEEE-OOOO-SCRSHHHH and more EEE-EEE-EEE. It had a base tone of something like middle C (can't recall exactly, it was 20 years ago now) and just added the byte value, with a modifier, on top of the base. So for long stretches it sounded pretty much the same to my ear, especially when transmitting ASCII text.
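For anyone curious, here's roughly what that scheme looks like as a sketch in Python. This is a reconstruction, not the original code: the sample rate, the exact base frequency (I've used middle C, about 261.63 Hz) and the per-byte modifier `STEP_HZ` are all assumptions, and the function names are made up for illustration.

```python
import math

SAMPLE_RATE = 44100   # samples per second (assumed)
BASE_HZ = 261.63      # roughly middle C; the real base tone isn't remembered
STEP_HZ = 10.0        # hypothetical modifier: Hz added per unit of byte value
TONE_MS = 90          # tone length per byte
GAP_MS = 10           # silence after each tone


def byte_to_freq(value: int) -> float:
    """Map a byte value (0-255) to a tone frequency above the base."""
    return BASE_HZ + value * STEP_HZ


def freq_to_byte(freq: float) -> int:
    """Inverse mapping: snap a measured frequency to the nearest byte value."""
    return max(0, min(255, round((freq - BASE_HZ) / STEP_HZ)))


def encode(data: bytes) -> list[float]:
    """Return audio samples in [-1, 1]: one sine tone plus a gap per byte."""
    tone_len = SAMPLE_RATE * TONE_MS // 1000
    gap_len = SAMPLE_RATE * GAP_MS // 1000
    samples: list[float] = []
    for value in data:
        freq = byte_to_freq(value)
        samples.extend(
            math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(tone_len)
        )
        samples.extend([0.0] * gap_len)
    return samples
```

At 100ms per byte that works out to 10 bytes per second. It also shows why text sounded so samey: lowercase ASCII letters sit in a narrow band of byte values, so their tones end up close together.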