Encoding Data in Dubstep Drops (benjojo.co.uk)
179 points by albertzeyer on Jan 15, 2021 | 90 comments



Reminds me of Aphex Twin (and some other artists) embedding images in the spectrograph render of their songs:

https://www.magneticmag.com/2012/08/the-aphex-face-visualizi...


Along the same lines, these crazy fools are making oscilloscope music, where the song itself is the visual.

https://youtu.be/XziuEdpVUe0


"... and CTF challenge authors love to encode text into audio waveforms, which you can see using the spectogram view (although a specialized tool called Sonic Visualiser is better for this task in particular)."

https://raw.githubusercontent.com/trailofbits/ctf/master/doc...


Stenography is common, but usually it's just modifying the last bits of an image or hiding the extra payload after the image data. This is "drawing" within the audio spectrum, making the image out of audio that will be audible.
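
For anyone unfamiliar with the "last bits" approach mentioned here, a minimal sketch follows. It assumes the carrier is just an array of 8-bit values (decoded pixel channels, say); the names `embedLsb` and `extractLsb` are made up for illustration, and real tools add headers, encryption and the like on top.

```javascript
// Hypothetical helpers: hide one payload bit in the lowest bit of each carrier byte.
function embedLsb(carrier, bits) {
  if (bits.length > carrier.length) {
    throw new RangeError('payload does not fit in carrier');
  }
  const out = Uint8Array.from(carrier);
  bits.forEach((bit, i) => {
    out[i] = (out[i] & 0xfe) | (bit & 1); // clear the lowest bit, then set it to the payload bit
  });
  return out;
}

// Read the lowest bit of the first `count` bytes back out.
function extractLsb(carrier, count) {
  return Array.from(carrier.slice(0, count), (byte) => byte & 1);
}

const pixels = Uint8Array.from([200, 13, 77, 254, 0, 91, 128, 255]);
const payload = [1, 0, 1, 1, 0, 0, 1, 0];
const stego = embedLsb(pixels, payload);
console.log(extractLsb(stego, payload.length).join('')); // prints: 10110010
```

Each carrier value changes by at most 1, which is why this is invisible in an image but also why it does not survive lossy re-encoding.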


Steganography. Stenography is something else. :)


This gets me every. Time. Lol


Was this hidden in the audio or just some garbage noise?


The "Aphex face" manifests as a clearly audible weird abstract sound at the end of the track. It sits nicely in the mix, but it's also distinct from the actual music - I would say it comes across like a signature rather than a musical element. The other examples in GP's linked article made more effort to produce things that are both musical and paintings in the spectrum, especially the Plaid track.


A DC offset, which amplitude-shift keying like this introduces, isn't so nice to your speakers. It might cause them to heat up as the offset waveform holds the magnet out in one direction. I do love the idea of hiding data in "messy" audio, though :)


In a traditional amplifier the DC should be blocked by the DC blocking capacitors, but I guess with modern full-bridge class-D amplifiers which don't require those it'll have to be done on the software/digital side?


A switch mode amp is still likely to have the blocking caps.


I'm no expert at all, I just saw that the DAC[1] I picked up for a project specifically states no blocking caps are needed.

I then came upon this application note[2], which states a full-bridge class D shares the benefits of a traditional bridge-tied load[3] amplifier, like not requiring DC blocking capacitors.

Not sure how common this is, but clearly blocking caps are not something to be assumed.

[1]: https://www.ti.com/product/PCM5102A#product-details##feature...

[2]: https://www.maximintegrated.com/en/design/technical-document...

[3]: https://en.wikipedia.org/wiki/Bridged_and_paralleled_amplifi...


Ah, that's a good observation. Looking at the datasheet for that TI chip, it is powered by +3.3 V, but contains an internal charge pump that generates a -3.3 V supply. So the output can swing across ground. That's what eliminates the need for blocking caps. And it specifies a DC offset error of +/- 1 millivolt. So they've designed around the need for blocking caps with internal circuitry and a guaranteed maximum output offset.

Now what happens if there is an input offset, such as a constant stream of input words other than zero? That's where the system designer has to take care of things. And there are offsets even when blocking caps are used. The caps only prevent compounding those offsets from one stage to the next.

And a certain amount of offset won't kill your speakers. It just has to be kept within reasonable bounds.

I've noticed when using the audio input of a PC for measurement, that there is a constant offset, which is just an artifact of the ADC. The audio recording software has a function for removing that offset. In fact I would consider an audio recording with a large residual offset to be a poor engineering practice.

Often, switchmode amplifiers with BTL output have a large normal mode offset, so they can be powered by a single supply. And single-supply audio circuits tend to be festooned with blocking caps. ;-)
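
The "remove offset" function in recording software can be as simple as subtracting the average sample value. A rough sketch, assuming the whole recording is in memory as floats (the helper names `mean` and `removeDcOffset` are made up; actual editors may use a high-pass filter instead):

```javascript
// Average value of a block of samples; for audio this is the DC offset.
function mean(samples) {
  let sum = 0;
  for (const x of samples) sum += x;
  return sum / samples.length;
}

// Subtract the measured offset from every sample.
function removeDcOffset(samples) {
  const offset = mean(samples);
  return Float64Array.from(samples, (x) => x - offset);
}

// Demo: a 10 Hz sine riding on a constant 0.25 offset, one second at 1 kHz.
const sampleRate = 1000;
const recorded = Float64Array.from({ length: sampleRate }, (_, n) =>
  0.25 + 0.5 * Math.sin((2 * Math.PI * 10 * n) / sampleRate));
const cleaned = removeDcOffset(recorded);
console.log(mean(recorded).toFixed(3), Math.abs(mean(cleaned)).toFixed(3));
```

Note that this would also wipe out data deliberately encoded as a slow offset if the averaging window were short, which is effectively what a blocking capacitor does.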


If it's audio rate how is it different than a squarewave?

Edit: looked closer at the post, the DC offset lasts for over 100ms per cycle.. yeah that's a problem


All waves are made of sine waves. DC offset is when the signal is too much one way or another.


Maybe in analog, but in digital, they are squares and what not type of waves.


Sigh... No, digital signals are not “square waves”. https://youtu.be/cIQ9IXSUzuM


I always laugh when vinyl is described as more “pure”. There’s nothing more pure than math, and that’s digital PCM audio. Sure, it’s stored as discrete samples, but that’s not how it comes out of speakers. The digital->analog converter will give you 1:1 perfect representation of the original waveform as long as you sample at 2x the highest frequency desired and there’s a low-pass filter in place.


For the record, I don't think vinyl enthusiasts ever describe vinyl audio as more pure. "Pure analog," yes, but that's different (and true). It's generally acknowledged by vinyl enthusiasts and audiophiles that vinyl introduces a lot of imperfections, which some people prefer.

Also worth noting that an ideal D/A converter will give you the exact waveform back, but such a device does not exist (but you can get pretty close).


Not strictly true, you must sample at 2x the frequency and at sufficient resolution in the amplitude domain, i.e. an ADC that samples at 44 kHz but with only one bit of resolution (outputs a 1 for positive input voltage and 0 for negative, say) would be pretty awful...


On the other hand a sigma-delta ADC that samples at 2.8 MHz with only one bit resolution is pretty good (for audio purposes).

With sigma-delta modulation you can trade sample rate for bit depth and vice versa.
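
To make the trade concrete, here is a toy first-order sigma-delta modulator. It is only a sketch under simplifying assumptions (input already in [-1, 1], a plain averaging window as the decoder); real converters use higher-order loops and proper decimation filters.

```javascript
// First-order sigma-delta modulator: input in [-1, 1], output is a stream of +1/-1.
function sigmaDelta(input) {
  const bits = new Int8Array(input.length);
  let integrator = 0;
  let feedback = 0;
  for (let i = 0; i < input.length; i++) {
    integrator += input[i] - feedback;   // accumulate the error between input and last output
    feedback = integrator >= 0 ? 1 : -1; // 1-bit quantizer
    bits[i] = feedback;
  }
  return bits;
}

// Crude decoder: average the 1-bit stream over a window (a boxcar low-pass filter).
function average(bits, start, length) {
  let sum = 0;
  for (let i = start; i < start + length; i++) sum += bits[i];
  return sum / length;
}

const steady = new Float64Array(10000).fill(0.5);
const stream = sigmaDelta(steady);
console.log(average(stream, 0, 100), average(stream, 0, 10000));
```

Every output sample is a single bit, yet the average over many samples converges on the input level; the longer the window (i.e. the higher the oversampling), the finer the effective resolution.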


Have you ever played with an Arduino or similar device? Comparing input signals on a digital pin vs an analog pin? Hopefully, you'll agree it's the same concept. If you haven't, I'd encourage you to try one out. They are loads of fun. I am a huge fan of analog, yet digital is just so damn convenient. If you have played with one, you'll understand why your comment makes me smile and chuckle.


A digital pin will most likely output a signal using a zero-order hold, which is the simplest type of reconstruction filter.

A zero-order hold is just one possible way of turning a digital signal into an analog one which you can then measure with your oscilloscope. But the output of the zero-order hold is not the digital signal, because the digital signal is only defined at discrete sample points.
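
A zero-order hold is easy to sketch: each sample value is simply held until the next one arrives. The function name and the upsampling factor below are illustrative assumptions, not anything from the video.

```javascript
// Zero-order hold: repeat each sample `factor` times, giving a staircase.
function zeroOrderHold(samples, factor) {
  const out = new Float64Array(samples.length * factor);
  for (let i = 0; i < out.length; i++) {
    out[i] = samples[Math.floor(i / factor)];
  }
  return out;
}

const digital = [0, 1, 0, -1];
console.log(Array.from(zeroOrderHold(digital, 3)).join(' '));
// prints: 0 0 0 1 1 1 0 0 0 -1 -1 -1
```

The staircase is a property of this particular reconstruction choice. The four original values say nothing about what happens between the sample points.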


Your comment might be more helpful (i.e. more likely to be read by others who could use the information you're providing) without the snark.


Sigh... you are wanting to show me that we can do A/D and D/A again? Thanks, I was totally unawares that we could do that. I've never heard an audio signal played back once it was digitized. My life is now complete.

What this guy is showing is not a digital signal. It is an analog signal that has been generated from digital data. Not sure what the point of all of this was, but thanks, I needed a break from finding this bug I've been trying to squash.


I recommend watching the video again, all the way through -- no skipping. It's pretty great.


The problem is not the waveform, it's the DC offset requiring more constant current to the voice coil.


Aren't all speakers AC coupled? Should be filtered out by something as simple as a blocking capacitor, no?


Not necessarily. The amplifiers sometimes are though.

In a normal 2 or 3 way speaker cabinet, you'll have an analog crossover which consists of something like a capacitor in series with the tweeter (high pass filter), and an inductor in series with the woofer (low pass filter).

In that case, the tweeter is protected from DC, but the woofer isn't.


Very cool thank you!


Yes, it's unlikely to cause problems in any real-world setup, but it's theoretically possible :P I've edited my original comment to clarify that it won't always be the case.


Signal that can damage hardware is something that used to be discussed with CRT monitors back in those ancient times, and rumours existed of a virus that played on this.

Having damaging audio signal is a new one for me.


That reminds me of this curious item from a few years ago: https://news.ycombinator.com/item?id=7205759


Also, if the data rate is low enough, compression will probably also remove such dc components?


Hey cool! Here's essentially the same encoder in a few lines of JS, you can run this on https://wavtool.com by pressing cmd+;

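  (() => {
    const message = 'asdf';
    // Pad each character code to 8 bits so every character has a fixed width
    // and the bit stream can be split back into characters when decoding.
    const messageBinary = message
      .split('')
      .flatMap(c => c.charCodeAt(0).toString(2).padStart(8, '0').split(''))
      .map(Number);
    const bitDurationSeconds = 0.1; // each bit holds its level for 100 ms
    const shiftSize = 0.1;          // size of the DC shift, in full-scale units
    return wavtool.mapSamplesCommand((sample, index, channelData, settings, context) => {
      const bitIndex = Math.floor(index / (bitDurationSeconds * context.sampleRate));
      const shift = bitIndex < messageBinary.length
        ? (2 * (messageBinary[bitIndex] - 0.5)) // {0,1} => {-1,1}
        : 0; // past the end of the message: leave the audio untouched
      return sample + shift * shiftSize;
    });
  })()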
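
For completeness, a possible decoder for this kind of shift keying, written as plain JS rather than against the wavtool API. This is a sketch under assumptions: `decodeShiftKeying` is a made-up name, the bit timing is known in advance, and the music averages out to roughly zero over each bit window. Turning the bits back into text additionally requires every character to occupy a fixed number of bits.

```javascript
// Hypothetical decoder: the sign of each bit window's average gives the bit.
function decodeShiftKeying(samples, sampleRate, bitDurationSeconds, bitCount) {
  const samplesPerBit = Math.round(bitDurationSeconds * sampleRate);
  const bits = [];
  for (let b = 0; b < bitCount; b++) {
    let sum = 0;
    for (let i = b * samplesPerBit; i < (b + 1) * samplesPerBit; i++) {
      sum += samples[i];
    }
    bits.push(sum > 0 ? 1 : 0);
  }
  return bits;
}

// Demo: apply the same +/-0.1 shift to a 440 Hz tone standing in for music.
const sampleRate = 8000;
const bitDurationSeconds = 0.1;
const sentBits = [0, 1, 1, 0, 0, 0, 0, 1]; // 'a' as 8 bits
const samplesPerBit = Math.round(bitDurationSeconds * sampleRate);
const encoded = Float64Array.from({ length: sentBits.length * samplesPerBit }, (_, i) => {
  const tone = 0.5 * Math.sin((2 * Math.PI * 440 * i) / sampleRate);
  const shift = 2 * (sentBits[Math.floor(i / samplesPerBit)] - 0.5);
  return tone + shift * 0.1;
});
const receivedBits = decodeShiftKeying(encoded, sampleRate, bitDurationSeconds, sentBits.length);
console.log(receivedBits.join(''));
// prints: 01100001
```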


You legend. Thanks.


I was hoping that this would be encoding the data in the audible, data-sounding part of the sound.

The technique described could be used on any bass-heavy music, and is in no way related to dubstep or its data-sounding-ness.


Whatever that music is, it sure ain't dubstep.

I wonder how well it'd work with something like this https://www.youtube.com/watch?v=VEAf_ZztCP0


Skrillex is actually regarded as a leader in Brostep, a subgenre and/or style of Dubstep. At least in the USA, Brostep has practically supplanted Dubstep and co-opted the name. This has been occurring for about the past 10 years. A big factor is the "drop" part of Brostep sounding very attractive to non-EDM people who head-bang, and to whom traditional Dubstep would be perceived as "boring" or not stimulating enough (without intoxication). This has led to a reinforcing cycle where domestic EDM festivals reach greater audiences, and so they keep promoting Brostep as Dubstep.

Personally, I'm not a fan of this shift. I prefer EDM from the era of the track you linked to, and/or contemporary artists who emulate the older styles.


Oh I know, I just get irrationally annoyed by people calling this stuff dubstep. It feels like half the problem is people and/or clubs don't have soundsystems appropriate for playing bass-heavy music at the right level, so they end up listening to stuff that's light on the low-end but still calling it "dubstep".


You are probably right. For EDM, cultural differences have kept the US lagging behind Europe and the UK. EDM-specific clubs are not common outside of major cities known for nightlife. A lot of EDM tours end up at venues that aren't designed for EDM. These are the conditions informing people's tastes, so probably a big factor for why bass isn't as prominent as elsewhere. There truly are fans and places where it is very much alive, just not as strong as other places IMO.


Brostep may be Dubstep Disneyland, but I'm not sure if being unconcerned with filing music in the right subcategory counts as "lagging".


The bass music underground is alive and well in many parts of the US.

Mainstream it is not, and THANK GOD for that


Pretty much every genre changes significantly over its lifetime though. It seems like you could just say you like early dubstep just like people say, "I like old school hip hop" all the time.


I get what you’re saying but the sound is so completely different it’s not a simple evolution of style.

Listen to dnb from two decades ago and you can still see where current stuff comes from. Compare early Digital Mystikz stuff with Skrillex and you wouldn’t call them the same thing.


This still gets played on a weekly basis in our house https://youtu.be/qwCr9QRNMc4

To save a click... It’s the garage track largely credited as the birth of dubstep


"Zed Bias - Neighbourhood" is actually regarded as one of the first dubstep songs, or at least a transition one.

Here are some other "true" dubstep tracks, in no particular order:

- Burial - Archangel https://www.youtube.com/watch?v=3J1gvgwHblI

- Rusko - Jahova https://www.youtube.com/watch?v=1OE_jjJkkD8

- Coleco - Taostic https://www.youtube.com/watch?v=krGadL6Je6A

- Kode9 - 9 Samurai https://www.youtube.com/watch?v=1-rEAe4C8gk

- Skream - Midnight Request Line https://www.youtube.com/watch?v=vJGXRQ9vBoU


I'd make a case for El-B, too:

Express: https://youtu.be/SLbXmPvtZXA

A lot of proto-wobble in this one.


Nice list. Added all to my YouTube playlist. Have also heard Skream - Midnight Request Line mentioned as the first dubstep tune.


I’d say Rusko is responsible for the whole brostep thing. You can see how it all grew from his tunes.


Yeah, Woo Boost was a divergence point for the genre for sure.


It's still very garage/2steppy. I think Anti War Dub by Digital Mystikz is generally considered the birth of dubstep. Of course, ask 10 people and you'll get 10 answers.


Midnight Request Line is the one I hear most frequently.


Midnight Request Line was the first dubstep song to go mainstream (IIRC Annie Mac picked it up), but the genre existed before that point. It's a great song though.


Annie Mac made dubstep go mainstream? TIL.

Side anecdote... My oldest bro was in a band with her brother Davey in school and my second oldest bro was in her class.


Oh I know, I’ve got the first few DMZ releases on vinyl but that seems like the first track that started to hit public consciousness.


El-B's "Ghost Rider" is often credited as the first dubstep track too.


Listening to the Skrillex samples in the article, it just sounds like DnB but... obnoxious.

The Loefah track in the parent comment though, I can get behind that.


It's even more obnoxious than breakcore, which is saying a lot.


Same, I did not know Skrillex but I dislike this style.

On the other hand I love Fonik, for example.

I love even more Deadmau5, E.T.H, ...


Skrillex ruined dubstep in the 'states. And he looks like a total d-bag.

I hate that guy.


I know the point of the article has nothing to do with the definition of dubstep so it took a lot of restraint to not write something about it ha, difficult when you grew up listening to the early stuff.

I'll take this as my only chance I'll probably get to post dubstep on HN in a valid discussion:

https://www.youtube.com/watch?v=rc85cGTlKLY

https://www.youtube.com/watch?v=dwva123XBMk

https://www.youtube.com/watch?v=l2n7w1H0pRQ


The James Blake remix of Changes is well worth checking out if you like the original!


I have it on vinyl :)


Yeah, this is the American "Dubstep" commonly referred to as "Brostep". Damn I miss the real Dubstep sound!


Still some bangers being released on Sentry records, Deep Medi, Bandulu and a bunch of other labels.


This fell out of style for the most part in America 5+ years ago, FYI.


Can confirm this is no longer the new thing, but FWIW, jazz music similarly fell out of style ages ago, but that doesn't make new stuff in the genre uninteresting to those who enjoy it irrespective of hype value.

That is, it's no longer nifty and fashionable to listen to this type of "Brostep", but people who liked it without regard to its social status may continue listening to new material as though nothing changed, while others may have grown tired of the sound or the social clout it may have brought them to be "in the know" or part of some zeitgeist and simply kept up with "today's hits".

The same is true of lots of genres IMHO.


Sorry, this is what the word means in the mainstream. When you say 'punk' people think of blink-182 rather than the Clash, and there's nothing you can do about that either. Maybe someone should have thought up a name slightly less stupid than brostep


Indeed. The fact that the article writer is from the UK makes that lapse inexcusable.


Big shout out to the ideas like this that are just someone having fun and being excited enough about the outcome to share with the world :)


Ben's blog is full of this stuff, it's fantastic, definitely my favorite 'doing fun stuff with tech' blogger of recent years.

https://blog.benjojo.co.uk/post/dns-filesystem-true-cloud-st... is one of my favorites.


a "Big shout out" is very appropriate for a post about dubstep


The way the author went about doing and explaining this is somewhat confusing. What he arrived at through that strange band split/merge process is actually ~identical* to:

- Apply an EQ filter that lowers the volume of the 0-100Hz band by 6dB (this happened because he halved the amplitude of the 100Hz band)

- Add a slow binary digital signal at around ~4 baud (2Hz fundamental), with slopes smoothed to around 10% of the bit time.

This has nothing to do with dubstep or bass drops - it would work for any song. It's just modulating data in infrasound, at 2Hz, which is well below the threshold of human hearing. The problem here is that he's also needlessly reducing the level of the 0-100Hz band to half the amplitude (6dB), which completely kills the bass feel of the original song. Dubstep fans will not approve (and he needs better speakers if he can't hear the difference).
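
As a quick check on the numbers: decibels for amplitude are 20·log10 of the ratio, so halving the amplitude is about 6 dB. The two helper names below are just illustrative.

```javascript
// Convert between linear amplitude ratios and decibels.
const amplitudeToDb = (ratio) => 20 * Math.log10(ratio);
const dbToAmplitude = (db) => Math.pow(10, db / 20);

console.log(amplitudeToDb(0.5).toFixed(2)); // prints: -6.02
console.log(dbToAmplitude(-1).toFixed(3));  // prints: 0.891
```

So a 6dB cut leaves half the amplitude, while a 1dB cut still leaves roughly 89% of it.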

A much simpler, more sensible process would be to just do this:

- Apply a steep highpass filter at 20Hz (the limit of human hearing), to remove any inaudible low-frequency (infra)sounds.

- Reduce the volume of the overall song by, say, around 1dB, to make a bit of headroom for the modulated digital signal

- Encode whatever you want in those 20Hz in the headroom you created (the amplitude can be quite low, e.g. 5%, it doesn't need to move the whole waveform over).

Then to decode it just lowpass the signal at 20Hz and do your bit detection after that - the filter will remove the audio, leaving only your signal, so it doesn't matter that your signal isn't a whole 50% of the output power. Now the song is only 1dB quieter. You can use as simple or as fancy a modulation technique as you want in that 20Hz band. You could use (normal) ASK as he did, just lowpass it to remove any high frequency components. You could use FSK. You could use QAM. Whatever.
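
A rough sketch of that process, with loudly stated assumptions: the "song" is a stand-in 100Hz tone that already has nothing below 20Hz (so the highpass step is skipped), the data is a plain unsmoothed ASK square wave at 4 baud, and the lowpass is four passes of a simple one-pole filter rather than anything steep. All names and constants here are made up for illustration.

```javascript
// One-pole low-pass filter, applied `passes` times for a steeper roll-off.
function lowPass(samples, sampleRate, cutoffHz, passes) {
  const alpha = 1 - Math.exp((-2 * Math.PI * cutoffHz) / sampleRate);
  let current = Float64Array.from(samples);
  for (let p = 0; p < passes; p++) {
    let y = 0;
    current = current.map((x) => (y += alpha * (x - y)));
  }
  return current;
}

const sampleRate = 8000;
const bitSeconds = 0.25;                       // about 4 baud
const samplesPerBit = sampleRate * bitSeconds; // 2000 samples
const bits = [1, 0, 0, 1, 1, 0];
const songGain = Math.pow(10, -1 / 20);        // about 1dB of headroom
const dataLevel = 0.05;                        // 5% of full scale

// Encode: quieter song plus a slow, low-level data signal.
const mixed = Float64Array.from({ length: bits.length * samplesPerBit }, (_, i) => {
  const song = 0.8 * Math.sin((2 * Math.PI * 100 * i) / sampleRate);
  const data = bits[Math.floor(i / samplesPerBit)] ? dataLevel : -dataLevel;
  return songGain * song + data;
});

// Decode: lowpass at 20Hz, then look at the sign in the middle of each bit.
const slow = lowPass(mixed, sampleRate, 20, 4);
const decoded = bits.map((_, b) => (slow[b * samplesPerBit + samplesPerBit / 2] > 0 ? 1 : 0));
console.log(decoded.join('')); // prints: 100110
```

The data survives even though it is far quieter than the bass tone, because the filter removes the tone before the bit decision is made.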

* His process actually also messes up the original 100Hz band by modulating it with a ~4Hz square waveform, due to the way he does the modulation by inverting and interpolating. That is going to create harmonics and other ickiness around the transitions. It also does not guarantee the absence of clipping, because he only reduced the amplitude of the low 100Hz band. This process can actually increase peak levels, as can happen any time you use frequency filtering: try his high-pass filter command on this file and watch sox complain of clipping, even though the original file does not clip: https://mrcn.st/t/filtering_clips.wav . So I would not recommend trying to emulate his approach precisely even if you want to achieve the same actual effect, since it's quite a silly way of going about it :)


It's not even dubstep/brostep related. It's also very inefficient.

The best picture/audio tool is probably Metasynth - as used by Aphex Twin. It's been around since the 90s and seems to be in development limbo at the moment, but with a bit of taming it will make all those classic dubstep/brostep sounds out of carefully selected images. Which can be graphic images of text.

https://uisoftware.com/metasynth/


Isn't this precisely what some ad tracking people or rights management are doing? Aren't they adding audio that is outside the range of human hearing but that things like Alexa can pick up, or so that the broadcaster can tell that a pub is showing a match without paying for it, etc.?


This kind of modulating data in infrasound or ultrasound is common, yes. It has been used in toys too, e.g. things that respond to certain sounds from a show. chibi-tech stuck some reverse engineered ultrasound triggers for a certain line of toys in some of her songs :)

https://twitter.com/chibitech/status/1237326756672983040

Infrasound only works digitally because no speaker system can reproduce frequencies that low, and many analog systems will corrupt them (e.g. AC coupling). Ultrasound is therefore used most of the time in practice, but I believe infrasound has been used in digital song watermarking for DRM/copyright tracking purposes.


Not really, unless you are doing a 1:1 digital copy, frequencies outside the range of human hearing are often removed.

Frequencies outside the range of a speaker are often filtered out as it can create distortion or even damage. And the job of lossy compression is to remove everything that you can't hear in order to save bytes, and limiting the bandwidth to what you can hear is the most basic step.

Instead, DRM systems typically encode data over a wide range of frequencies (spread spectrum), well within the audible range. It is designed in such a way that you could hear it in theory, but don't notice it because it blends with background noise. It is very robust, resisting compression, recording and even deliberate attacks. In fact, it is one of the techniques used by the military radios to resist jamming.
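
The core idea can be sketched in a few lines, with heavy caveats: this is a toy direct-sequence scheme, not any real DRM system. The seed, chip count and level are arbitrary assumptions, the host is a plain sine, and real watermarks shape the noise psychoacoustically and have to deal with synchronization.

```javascript
// Small seeded PRNG (mulberry32-style) so embedder and detector share the same chips.
function makeRng(seed) {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pseudo-random +1/-1 "chips": broadband noise known only to someone holding the seed.
function makeChips(seed, length) {
  const rng = makeRng(seed);
  return Int8Array.from({ length }, () => (rng() < 0.5 ? -1 : 1));
}

// Add the chip sequence at a low level, flipped in sign for 0 bits.
function embed(host, bits, chips, chipsPerBit, level) {
  return host.map((x, i) => {
    const sign = bits[Math.floor(i / chipsPerBit)] ? 1 : -1;
    return x + level * sign * chips[i];
  });
}

// Correlate against the same chips; the host averages out, the watermark adds up.
function detect(signal, chips, chipsPerBit, bitCount) {
  const bits = [];
  for (let b = 0; b < bitCount; b++) {
    let corr = 0;
    for (let i = b * chipsPerBit; i < (b + 1) * chipsPerBit; i++) {
      corr += signal[i] * chips[i];
    }
    bits.push(corr > 0 ? 1 : 0);
  }
  return bits;
}

const chipsPerBit = 40000;
const bits = [1, 0, 1, 1];
const chips = makeChips(1234, bits.length * chipsPerBit);
const host = Float64Array.from({ length: chips.length }, (_, i) => 0.5 * Math.sin(i * 0.05));
const marked = embed(host, bits, chips, chipsPerBit, 0.02);
console.log(detect(marked, chips, chipsPerBit, bits.length).join(''));
```

The watermark sits at 4% of the host's amplitude in every sample, yet summing over many chips pulls it out reliably. The same correlation gain is what makes the approach resistant to narrowband interference.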


You can also trigger wake words like "Ok Google" or "Alexa" by using _harmonics_ of normal human voices that are outside the range of audible sound. The key is that the mic can't differentiate between the harmonics and actual speech of normal human pitch and so the trigger is set off, but the sound isn't audible to humans.

I don't have a link to the paper handy (sorry!) but IIRC I found a white paper on ArXiv called "Dolphin Attack" or similar that demonstrated this. It was a fun read.


While technically that sounds like a cool hack, it sounds like a dumb thing to actually allow to happen. "Okay Google" or "Hey Alexa" outside of the frequencies reproducible by human voices should be filtered out completely. At least, from my comfy chair nit picking someone else's work. Of course, not filtering down to the acceptable bands is what allows them to do all of the ad tracking/rights management things they are allowing to occur. The fact that Alexa/Google is able to do this ad tracking/rights management is just another example showing that the mic is listening 100% of the time.


Last I looked, they used non-linear mixing in the analog circuitry to down-convert ultrasonic audio into what looks like human speech to the devices.
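
A small numeric illustration of that down-conversion, assuming the non-linearity can be modelled as a pure squaring term (a simplification of how real microphones and amplifiers behave) and using a 1kHz tone as a stand-in for speech. The frequencies and the helper name `toneAmplitude` are illustrative choices.

```javascript
// Amplitude of the component at `freqHz`, measured by correlating with a complex tone
// (a single DFT bin). Exact when the signal holds a whole number of periods.
function toneAmplitude(samples, sampleRate, freqHz) {
  let re = 0;
  let im = 0;
  for (let n = 0; n < samples.length; n++) {
    const phase = (2 * Math.PI * freqHz * n) / sampleRate;
    re += samples[n] * Math.cos(phase);
    im -= samples[n] * Math.sin(phase);
  }
  return (2 * Math.hypot(re, im)) / samples.length;
}

const sampleRate = 192000;
const carrierHz = 30000; // ultrasonic carrier
const voiceHz = 1000;    // stand-in for speech

// AM: the "voice" only exists as the envelope of the ultrasonic carrier.
const transmitted = Float64Array.from({ length: sampleRate }, (_, n) => {
  const t = n / sampleRate;
  return (1 + 0.5 * Math.cos(2 * Math.PI * voiceHz * t)) * Math.cos(2 * Math.PI * carrierHz * t);
});

// Model the microphone/amplifier non-linearity as a squaring term.
const distorted = transmitted.map((x) => x * x);

console.log(toneAmplitude(transmitted, sampleRate, voiceHz).toFixed(3)); // prints: 0.000
console.log(toneAmplitude(distorted, sampleRate, voiceHz).toFixed(3));   // prints: 0.500
```

Before the non-linearity there is no energy at 1kHz at all, so filtering the digital signal afterwards cannot tell the injected tone from a real one. A filter would have to sit ahead of the non-linear stage.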


Do you have a link? I assumed a low pass filter was applied to the signal to reduce unnecessary data transmission. I have heard about what you're saying and I'm not dismissing it. One could run their television through a low pass filter to get around such tracking, right?


That's kind of what I was implying with one of my other replies. This filtering is something that Alexa/Googs should be doing on their end. Their mics should only be listening in the frequency ranges where human voices exist.


The limit of human hearing is around 20kHz, or a bit above. Emphasis on the k. It is totally possible to hear 20Hz noise. In fact 20kHz is better for this because you can use a higher bit rate and that band is going to have very little intentional noise to begin with.


Wrong side of the spectrum. The lower limit of human hearing is 20Hz. It is not possible to hear <20Hz noise. It is, however, possible to feel it, if the sound pressure is loud enough and you actually have a subwoofer capable of reproducing those frequencies, but that is rather unlikely unless your subwoofer is a cut-out in your room's wall with a fan in it.

https://en.wikipedia.org/wiki/Rotary_woofer

If you feed a 20Hz signal to a typical home subwoofer (or even most club systems) and hear something, you aren't hearing 20Hz. You are hearing a bunch of high frequency rubbing noises as the speaker cone moves at 20Hz, trying and utterly failing to couple any amount of energy at that frequency into the air. This is why many songs these days are produced with "bass maximizers" and why modern laptops can sometimes have "decent bass". It's not bass, it's a filter that purposely distorts the bass, which your speakers can't reproduce, into higher frequencies, which it can and which we've learned to associate with heavy bass played through systems that can't reproduce it but distort instead.

Just for reference, I believe these are the subs we use at Euskal Encounter. I can vouch for the fact that they can make the floor shake in a massive event hall venue. Low end response: down to 28Hz. No more.

https://jblpro.com/en/products/vtx-b18

It is indeed better to modulate data in ultrasound since you have a lot more bandwidth - except for the fact that any lossy compression applied to your file is going to completely destroy your data. This is one thing the author got absolutely right.


Just going to drop this here then, https://www.youtube.com/watch?v=eTNx_SsApu4


This is really cool. Can the author or anybody else tell me how he made those gif sketches with the moving waveforms and equations?


I was thinking it’d be something more like this: http://www.windytan.com/2015/10/pea-whistle-steganography.ht...


Hog’s heaven for me. I like dubstep, and this is good dubstep. I like tutorials, and this is a good tutorial. +1


0-100hz Skrillex is clearly the best Skrillex



