
Isn't this precisely what some ad tracking or rights management people are doing? Aren't they adding audio that is out of range for human hearing but that things like Alexa can hear, so that ad exposure can be tracked, or so the broadcaster can tell that a pub is broadcasting a match without paying for it, etc.?



This kind of data modulation in infrasound or ultrasound is common, yes. It has been used in toys too, e.g. things that respond to certain sounds from a show. chibi-tech stuck some reverse-engineered ultrasound triggers for a certain line of toys in some of her songs :)

https://twitter.com/chibitech/status/1237326756672983040

Infrasound only works digitally because no speaker system can reproduce frequencies that low, and many analog systems will corrupt them (e.g. AC coupling). Ultrasound is therefore used most of the time in practice, but I believe infrasound has been used in digital song watermarking for DRM/copyright tracking purposes.


Not really. Unless you are doing a 1:1 digital copy, frequencies outside the range of human hearing are often removed.

Frequencies outside the range of a speaker are often filtered out, as they can create distortion or even damage. And the job of lossy compression is to remove everything you can't hear in order to save bytes; limiting the bandwidth to what you can hear is the most basic step.

Instead, DRM systems typically encode data over a wide range of frequencies (spread spectrum), well within the audible range. It is designed in such a way that you could hear it in theory, but you don't notice it because it blends with the background noise. It is very robust, resisting compression, recording, and even deliberate attacks. In fact, it is one of the techniques used by military radios to resist jamming.
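
Roughly, the idea looks like this (a toy Python sketch of direct-sequence spreading I made up to illustrate the concept, not any real watermarking scheme; every name and parameter is invented): each payload bit is spread over a long pseudo-random ±1 sequence derived from a shared key and added at very low amplitude across the band, and the detector recovers it by correlating with the same sequence.

    import numpy as np

    KEY = 42          # shared secret: seeds the spreading ("chip") sequence
    CHIP_LEN = 4096   # samples used per embedded bit

    def embed(audio, bits, alpha=0.01):
        # Spread each bit over a pseudo-random +/-1 chip sequence and add it
        # at low amplitude; the energy sits across the whole (audible) band.
        rng = np.random.default_rng(KEY)
        marked = audio.copy()
        for i, bit in enumerate(bits):
            chips = rng.choice([-1.0, 1.0], size=CHIP_LEN)
            seg = slice(i * CHIP_LEN, (i + 1) * CHIP_LEN)
            marked[seg] += alpha * (1.0 if bit else -1.0) * chips
        return marked

    def detect(audio, n_bits):
        # Regenerate the same chips from the key and correlate; the sign of the
        # correlation recovers each bit even though the mark is far below the host.
        rng = np.random.default_rng(KEY)
        bits = []
        for i in range(n_bits):
            chips = rng.choice([-1.0, 1.0], size=CHIP_LEN)
            seg = audio[i * CHIP_LEN:(i + 1) * CHIP_LEN]
            bits.append(int(seg @ chips > 0))
        return bits

    host = 0.1 * np.random.default_rng(7).standard_normal(44100)  # 1 s of noise-like "audio"
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    print(detect(embed(host, payload), len(payload)))              # recovers the payload

Because each bit is smeared over thousands of samples, lossy compression, re-recording, or added noise only lowers the correlation a little instead of destroying it, which is where the robustness comes from.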


You can also trigger wake words like "Ok Google" or "Alexa" by using _harmonics_ of normal human voices that are outside the range of audible sound. The key is that the mic can't differentiate between the harmonics and actual speech at a normal human pitch, so the trigger is set off, but the sound isn't audible to humans.

I don't have a link to the paper handy (sorry!) but IIRC I found a white paper on ArXiv called "Dolphin Attack" or similar that demonstrated this. It was a fun read.


While that technically sounds like a cool hack, it sounds like a dumb thing to actually allow to happen. "Okay Google" or "Hey Alexa" outside of the frequencies reproducible by human voices should be filtered out completely. At least, that's my comfy-chair nitpicking of someone else's work. Of course, not filtering down to the acceptable bands is what lets them do all of the ad tracking/rights management things they allow to occur. The fact that Alexa/Google is able to do this ad tracking/rights management is just another example showing that the mic is listening 100% of the time.


Last I looked, they used non-linear mixing in the analog circuitry to down-convert ultrasonic audio into what looks like human speech to the devices.
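
You can see the effect with a toy simulation (this is just me illustrating the math, not the front end of any actual device; all the numbers are made up): amplitude-modulate a "command" onto a 30 kHz carrier, pass it through a small quadratic non-linearity, low-pass the result, and a baseband copy of the command falls out.

    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 192_000                                  # simulation sample rate (Hz)
    t = np.arange(int(0.1 * fs)) / fs             # 100 ms

    command = np.sin(2 * np.pi * 300 * t)         # stand-in for the spoken wake word (300 Hz tone)
    carrier = np.sin(2 * np.pi * 30_000 * t)      # 30 kHz carrier: inaudible to humans
    transmitted = (1 + 0.5 * command) * carrier   # amplitude modulation onto ultrasound

    # A real mic/amp is not perfectly linear; model that with a small quadratic term.
    received = transmitted + 0.2 * transmitted ** 2

    # transmitted^2 contains carrier^2, which has a DC component, so a low-frequency
    # copy of `command` appears. The voice front end's low-pass/anti-alias filtering
    # then removes everything near 30 kHz and 60 kHz, leaving "speech".
    b, a = butter(4, 8_000, fs=fs)                # 8 kHz low-pass, like a voice pipeline
    demodulated = filtfilt(b, a, received)

    # Correlation with the original command is close to 1 (edges trimmed for filter transients).
    print(np.corrcoef(demodulated[2000:-2000], command[2000:-2000])[0, 1])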


Do you have a link? I assumed a low pass filter was applied to the signal to reduce unnecessary data transmission. I have heard about what you're saying and I'm not dismissing it. One could run their television through a low pass filter to get around such tracking, right?
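
Something like this is what I had in mind, as a purely digital stand-in (in practice it would have to be an analog filter between the TV and its speakers; the file name and cutoff here are made up):

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, filtfilt

    fs, audio = wavfile.read("tv_output.wav")       # hypothetical capture of the TV's audio (44.1/48 kHz)
    audio = audio.astype(np.float64)

    # Keep the band people actually hear; anything near-ultrasonic gets stripped.
    b, a = butter(6, 16_000, fs=fs)
    filtered = filtfilt(b, a, audio, axis=0)        # zero-phase, so no audible smearing

    wavfile.write("tv_output_filtered.wav", fs,
                  np.clip(filtered, -32768, 32767).astype(np.int16))

Though I guess if the tracking signal sits within the audible band, like the spread-spectrum watermarks mentioned upthread, a low-pass filter wouldn't touch it.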


That's kind of what I was implying with one of my other replies. This filtering is something that Alexa/Googs should be doing on their end. Their mics should only be listening in the frequency ranges where human voices exist.



