Aphex Twin has hidden spectrogram images in some of his tracks in the past. Unfortunately, I think they are not well preserved through the multiple steps of compression/re-encoding that YouTube videos go through.
This turns an image into a PCM audio file. The image is visible on the spectrogram. For example, here's Ernest Hemingway punting a beer can: http://imgur.com/QR5a8mw
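For anyone curious how an image can end up in a spectrogram, here is a minimal sketch of one possible approach (additive synthesis), not necessarily what the linked program does. Each image row is treated as one sine oscillator on a linear frequency scale, and each column as a short time slice. The function names, the 8 kHz sample rate, and the 50 ms column length are all assumptions made for illustration.

```python
import math
import struct
import wave


def image_to_samples(image, sample_rate=8000, column_seconds=0.05):
    """Turn a grayscale image into audio samples in the range -1..1.

    `image` is a list of rows (top row first); each row is a list of
    brightness values between 0.0 and 1.0.
    """
    height = len(image)
    width = len(image[0])
    nyquist = sample_rate / 2
    samples_per_column = max(1, round(sample_rate * column_seconds))
    # Linear frequency scale: top row is the highest frequency, bottom row
    # the lowest. All frequencies stay strictly between 0 and Nyquist.
    freqs = [nyquist * (height - row) / (height + 1) for row in range(height)]
    samples = []
    for col in range(width):
        for n in range(samples_per_column):
            t = (col * samples_per_column + n) / sample_rate
            value = sum(
                image[row][col] * math.sin(2 * math.pi * freqs[row] * t)
                for row in range(height)
            )
            # Dividing by the row count keeps the sum inside -1..1.
            samples.append(value / height)
    return samples


def write_wav(path, samples, sample_rate=8000):
    """Write samples (floats in -1..1) as 16-bit mono PCM."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        wav.writeframes(frames)


if __name__ == "__main__":
    # A tiny 4x4 diagonal line; a real image would need a loader.
    diagonal = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    write_wav("diagonal.wav", image_to_samples(diagonal))
```

This is slow for large images (it is O(width × height × samples per column)); an inverse FFT per column would be the usual faster route.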
+1 for stb_image.h. It has a good interface for an image loading library, something nobody seemed capable of even in the "good ol' days"... It's just a shame it doesn't support more of PNG (16-bit channels would be nice, and support for bigger images too).
Using libpng, libjpeg, etc., however, is a massive pain, with lots of work required even if you just want to do what pretty much everyone does: load a file into an RGBA buffer and get back the width and height. :)
Yep. The stb_image.h API is very simple. Also, because it's just an #include, you don't need to worry about your users needing to figure out how to link a library.
I got some very interesting results from feeding fractal images into a program like this a few years back. Unfortunately, I don't have the resulting sounds, but if you pick less busy images with filaments and adjust the contrast, the result is very organic. Must have a go with this ...
If it's just through a cheap speaker and mic, you'll probably lose a lot of the image.
This uses a linear frequency scale (which is just the nature of Fourier transforms), whereas our ears are sensitive on a log frequency scale. In other words, the information that's most important to our hearing, which is what a mic & speaker will preserve the best, is in the bottom 10% of the image.
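To put a rough number on the linear-versus-log point, here is a small sketch that computes what share of the audible octaves falls inside the bottom slice of a linear-frequency image. The 22,050 Hz top (i.e. a 44.1 kHz sample rate) and the 20 Hz lower limit of hearing are assumptions, and `octave_share` is just an illustrative name.

```python
import math


def octave_share(fraction_of_image, nyquist_hz=22050.0, lowest_audible_hz=20.0):
    """Share of the audible octaves covered by the bottom part of the image.

    `fraction_of_image` is the height of the slice, measured from the bottom,
    on a linear frequency axis running from 0 Hz to `nyquist_hz`.
    """
    top_of_slice_hz = fraction_of_image * nyquist_hz
    if top_of_slice_hz <= lowest_audible_hz:
        return 0.0
    total_octaves = math.log2(nyquist_hz / lowest_audible_hz)
    slice_octaves = math.log2(top_of_slice_hz / lowest_audible_hz)
    return slice_octaves / total_octaves


if __name__ == "__main__":
    print(f"bottom 10% of image: {octave_share(0.10):.0%} of audible octaves")
    print(f"bottom 50% of image: {octave_share(0.50):.0%} of audible octaves")
```

Under those assumptions, the bottom 10% of the image holds roughly two thirds of the octaves we can hear.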
A cheap speaker & mic will probably lose a lot of content above about 10 kHz - which is the entire top half of the image. Even though this wouldn't be that huge a difference to our ears, it would sure look bad in the image.
As for background noise, the difference would probably look like the difference here: http://www.sweetwater.com/insync/media/2010/09/RXAdv-e-xlarg... (that's a screenshot of audio restoration software that removes noise, so it's technically doing the opposite process as best it can, but the difference would be similar).
This isn't exactly related (as it doesn't produce a spectrogram), but the software pixivisor plays around with this idea: http://warmplace.ru/soft/pixivisor/
The software can act as a transmitter or receiver. In transmitter mode, you can provide it a static image or animated GIF, which it will convert into audio that plays continuously. In receiver mode, pixivisor listens via the mic or line-in (depending on the hardware platform and what's attached) and reconstructs the image from the audio. You can then manipulate the audio however you want.
I would expect to see the frequency response of the speaker + mic combination show up as darker and lighter horizontal bands in the image (since these spectrograms are plotted with frequency on the y axis). It might look really cool!
Edit: depending on the time alignment / phase response of the speakers, you might see the low frequency parts of the image get distorted to the right.
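A quick way to preview the banding effect described above is to scale each image row by a gain that depends on that row's frequency. The sketch below uses a made-up response (flat between 200 Hz and 10 kHz, rolling off at 12 dB per octave outside that range); a real speaker and mic would have a much bumpier curve, and all names and numbers here are assumptions. It models magnitude only, so the phase/time-alignment smearing from the edit is not covered.

```python
def row_frequencies(height, nyquist_hz=22050.0):
    """Center frequency of each image row, top row first (linear scale)."""
    return [nyquist_hz * (height - row - 0.5) / height for row in range(height)]


def toy_gain(freq_hz, low_cut_hz=200.0, high_cut_hz=10000.0):
    """Hypothetical speaker + mic magnitude response.

    Flat in the passband; amplitude falls by a factor of 4 (about 12 dB)
    per octave below `low_cut_hz` and above `high_cut_hz`.
    """
    if freq_hz < low_cut_hz:
        return (freq_hz / low_cut_hz) ** 2
    if freq_hz > high_cut_hz:
        return (high_cut_hz / freq_hz) ** 2
    return 1.0


def apply_response(image, gain=toy_gain, nyquist_hz=22050.0):
    """Darken each row of a grayscale image (rows of 0..1 values) by its gain."""
    freqs = row_frequencies(len(image), nyquist_hz)
    return [[pixel * gain(f) for pixel in row] for row, f in zip(image, freqs)]


if __name__ == "__main__":
    white = [[1.0] * 3 for _ in range(8)]
    for freq, row in zip(row_frequencies(8), apply_response(white)):
        print(f"{freq:8.1f} Hz  brightness {row[0]:.2f}")
```

With these numbers the rows above 10 kHz fade progressively toward the top, which is the "top half of the image" loss mentioned earlier in the thread.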
https://www.google.de/search?q=Aphex+Twin+Spectrograph&tbm=i...
https://youtu.be/M9xMuPWAZW8?t=5m31s