Variable duty cycle square waves with the Web Audio API (danblack.co)
47 points by iamdan 16 hours ago | 61 comments





To me, the Waveshaper 50% duty cycle sounds strikingly like the Fourier 25% duty cycle.

Is this something to expect or is there an error in the code?


For that case, I guess AudioWorklet would be the more recommended way to experiment.

you might also be interested in Glicol (https://glicol.org/)

especially this example: https://glicol.org/tour#meta2

it's rhai in rust -> wasm -> sab -> audioworklet


As others noted, merely switching back and forth between -1 and 1 will result in heavy aliasing. Also, since you can only switch from -1 to 1 at integer sample points, you won't be able to accurately generate frequencies that don't divide the sample rate evenly. E.g. if your sample rate is 48 kHz, you won't be able to generate a 50% duty cycle 11 kHz square wave. Either the duty cycle, or the frequency will be different. Like, audibly different.
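To put numbers on the 48 kHz / 11 kHz case, here is a minimal Python sketch. It assumes a naive generator that can only flip sign on whole-sample boundaries and keeps a fixed whole-sample period; the function name is made up for illustration.

```python
def nearest_integer_square(sample_rate, target_hz, duty):
    """Closest naive square wave whose period and high time are whole samples.

    Returns (actual_frequency_hz, actual_duty_cycle).
    """
    # Whole number of samples per cycle (at least 2, so there is a high and a low)
    period = max(2, round(sample_rate / target_hz))
    # Whole number of samples spent at +1 (at least 1, at most period - 1)
    high = min(period - 1, max(1, round(period * duty)))
    return sample_rate / period, high / period


freq, duty = nearest_integer_square(48_000, 11_000, 0.5)
print(freq, duty)  # 12000.0 0.5
```

The period rounds from about 4.36 samples down to 4, so the wave comes out at 12 kHz instead of 11 kHz. Holding the average at 11 kHz would mean mixing 4- and 5-sample cycles, which shows up as a duty cycle that wobbles from cycle to cycle.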

The proper way is either via the Fourier series; or look into BLIT [1] synthesis.

[1] https://ccrma.stanford.edu/~stilti/papers/blit.pdf
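A minimal sketch of the Fourier-series route, in Python for brevity: add up the cosine harmonics of the pulse wave and stop before Nyquist, so nothing is left to fold back. The function name and the choice to drop the DC term are my assumptions, and this is the slow textbook sum, not BLIT.

```python
import math


def bandlimited_pulse(freq_hz, duty, sample_rate, num_samples):
    """Additive pulse wave, roughly in [-1, 1], using only harmonics below Nyquist.

    The high portion is centred on t = 0. The DC term (2 * duty - 1) is left
    out so the output stays centred on zero.
    """
    nyquist = sample_rate / 2
    harmonics = []
    n = 1
    while n * freq_hz < nyquist:
        # Fourier cosine coefficient of a +1/-1 pulse with the given duty cycle
        amplitude = 4.0 / (n * math.pi) * math.sin(n * math.pi * duty)
        harmonics.append((n, amplitude))
        n += 1

    out = []
    for i in range(num_samples):
        t = i / sample_rate
        out.append(sum(a * math.cos(2 * math.pi * k * freq_hz * t)
                       for k, a in harmonics))
    return out


one_cycle = bandlimited_pulse(1000, 0.5, 48_000, 48)
```

In Web Audio, I believe the same coefficients could be handed to createPeriodicWave instead of summing by hand.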


Aliasing seems like a potentially desirable feature for a chiptune / retro sound, though.

This is the nice thing about constraining myself to a rather old and crude sound style, the rough edges can remain a little rough

Blargg's Blip Buffer library is a widely-used implementation (especially in chip music synthesis), explained in detail here: https://www.slack.net/~ant/bl-synth/

The given examples seem disconnected from the duty cycle radios on my browser. I mean changing them changes the sound, but it's as if the one I selected is not the one playing.

Anyway, what could have been the purpose for the gameboy hardware to provide both 25% and 75% duty cycle? In audio, these sound identical to a human, no? They are the same waveform with inverted polarity. They have the same overtone content.


> The given examples seem disconnected from the duty cycle radios on my browser. I mean changing them changes the sound, but it's as if the one I selected is not the one playing.

Confirmed. Author, please fix! For example, in the first set of radio buttons:

1. Leave it at the 50% default

2. Press play

3. Change it to the 12.5% option -- we continue to hear the same sound

4. Change it back to 50% -- finally we hear a different sound

This is broken. Another example:

1. Listen to 12.5% after having come directly from 25%

2. Listen to 12.5% after having come directly from 50%

The 12.5% should sound identical in either case, but it erroneously does not.


This is correct, the demo was not properly handling the selection of a new duty cycle. I pushed up a correction just now. Thanks for laying out a detailed replication - made for an easy fix

I am a little confused by the article because it sounds like they are describing "pulse width" which is a common parameter on analog and digital synthesizers to change the character of the square wave. A square wave with a low pulse width will sound thinner than one with a high pulse width, and layering square waves with different pulse widths gives you a pleasant phasing effect.

Based on some cursory research, however, it seems that duty cycle is different than pulse width, so now I am unsure if they are trying to use duty cycle variation to implement pulse width modulation (PWM) or if they are doing something else entirely.


More precisely "pulse width" would be a time, while "duty cycle" would be a percent.

And while, going from 0% to 50% duty cycle, it could be said that "a square wave with a low pulse width will sound thinner than one with a high pulse width", once you go past 50% duty cycle the situation reverses. So a 25% duty cycle would sound almost identical to a 75% duty cycle... the amplitudes of their Fourier transform components would be identical.
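A quick way to check the "identical amplitudes" part, as a small Python sketch. It uses the standard Fourier series of an ideal +1/-1 pulse wave, where harmonic n has magnitude (4 / (n * pi)) * |sin(n * pi * duty)|; the function name is just illustrative.

```python
import math


def harmonic_magnitudes(duty, count):
    """Magnitudes of harmonics 1..count for an ideal +1/-1 pulse wave."""
    return [abs(4.0 / (n * math.pi) * math.sin(n * math.pi * duty))
            for n in range(1, count + 1)]


quarter = harmonic_magnitudes(0.25, 8)
three_quarters = harmonic_magnitudes(0.75, 8)

# The two lists agree to within floating-point rounding, and every
# fourth harmonic is (numerically) zero for both duty cycles.
for n, (a, b) in enumerate(zip(quarter, three_quarters), start=1):
    print(n, round(a, 6), round(b, 6))
```

Only the signs of some harmonics differ between the two, which is the polarity inversion mentioned elsewhere in the thread.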


> almost identical ... components would be identical

I'm having a tough time reconciling how the former could be almost identical while the latter is identical. I guess the former involves a human listening through a speaker which has asymmetric imperfections (maybe the speaker moves outward more easily than it moves inward, or a DC offset in the signal leads to compression in the high-excursion side that doesn't exist on the low-excursion side, etc.) whereas the FFT readout doesn't necessarily have a speaker in the system at all.


Different linearity properties on the positive and negative side would be pretty bad for a speaker, but possible. In the case of a square wave, non-linearity would be identical to a fixed amplitude change though, possibly with a DC bias.

Based on the gameboy wiki I looked up, the phase of the 25% duty and 75% duty are such that they are inverse of each other, seemingly eliminating the possibility of combining the two for different waveforms.


25% and 75% would sound identical alone, but in a mix there often are interplays where it can create a difference. An easy way to hear it is to run two synced oscillators, say a square and a saw, with sharp attack. The resulting sound should be sufficiently different, one side would dampen the attack compared to the other. Furthermore, I think in hardware synths and those that emulate them changing pulse width can cause the module to implicitly shift the signal up or down to ensure consistent average voltage, further complicating things. I am curious what you mean by compression.

Good point. If I have 2 oscillators, and no control over their phase as they mix, then an option to choose 25% vs 75% for one of them would at least offer some variation instead of none.

As for compression, this [0] is a good intro. Most commonly it is applied to a signal deliberately to achieve a desired outcome, but I'm referring to a (generally) undesired speaker nonlinearity [1] near its maximum power handling capacity.

[0] https://en.wikipedia.org/wiki/Dynamic_range_compression

[1] https://marshallforum.com/threads/what-exactly-does-speaker-...


Thanks! I’m very familiar with the first one, but never thought of the second one actually.

They are synonyms in this context

If you mix a square wave of one duty cycle with another of a different duty cycle, they partially cancel each other out and you get a new sound.

I'm not sure, but I believe the original NES Castlevania does this in some places, like in the "you died" jingle. (It's possible I'm misremembering and it's simply two square notes separated by an octave.)


This should hopefully be fixed now, there was an issue with handling the radio button selection change.

I haven't been able to find anything about why exactly they chose to provide both 25% and 75% DCs - they do sound the same minus the inverted polarity like you mentioned.


To my ears it sounded like 25% and 75% duty cycles were 50%, and 50% sounded like a shorter one, but not sure.

Yes, that's how it sounded to me as well.

I had a bug in my demo code around handling the duty cycle change, this should be fixed now

> In audio, these sound identical to a human, no?

If it was just that single wave, but there is more than 1 audio channel.


Adding more channels does nothing to the overtone information. From what I can find[1], it seems that the phase of the 25% and 75% waves are such that the two waves are actually inverse of each other. I don't know much about Gameboy hardware though. Do you actually know what the point of this is?

[1]: https://gbdev.gg8.se/wiki/articles/Gameboy_sound_hardware#Sq...


Sure, but having a channel with 25% duty cycle of frequency f and a channel with 75% duty cycle of frequency 3f will lead to a different waveform after mixing compared to 75% at f and 25% at 3f, no?

Depending on how that's processed downstream it could sound very different I imagine.

In crude ASCII art (two inputs mixed to an output):

  ---_________
  
  ---_---_---_
  
  
  ¨¨¨_---_---_
vs

  ---------___
  
  -___-___-___
  
  
  ¨---¨---¨___
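The same comparison as a small Python sketch, using exact 25% / 75% splits on a 12-step grid with both inputs starting high at step 0. The helper names and the +1/-1 levels are my choices; 2, 0 and -2 in the result correspond to the three levels in the drawing.

```python
def pulse(steps, cycles, duty):
    """`steps` samples covering `cycles` pulse periods, high (+1) first."""
    period = steps // cycles
    high = round(period * duty)
    return [1 if (i % period) < high else -1 for i in range(steps)]


def mix(a, b):
    """Plain sample-by-sample sum of two equally long signals."""
    return [x + y for x, y in zip(a, b)]


mix_a = mix(pulse(12, 1, 0.25), pulse(12, 3, 0.75))  # 25% at f + 75% at 3f
mix_b = mix(pulse(12, 1, 0.75), pulse(12, 3, 0.25))  # 75% at f + 25% at 3f

print(mix_a)  # [2, 2, 2, -2, 0, 0, 0, -2, 0, 0, 0, -2]
print(mix_b)  # [2, 0, 0, 0, 2, 0, 0, 0, 2, -2, -2, -2]
```

So the waveforms really are different. With this particular phase alignment, though, mix_b is mix_a played backwards with the polarity flipped, and both of those operations leave the harmonic magnitudes unchanged, so any audible difference would have to come from something non-linear downstream.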

I saw a video a few years ago that provided pretty convincing experimental evidence that's not actually the case.

https://www.youtube.com/watch?v=Ffka-hPzug0


Very interesting, though any non-linear effects would definitely get affected.

Not sure how much that comes into play on the Gameboy though.


If you play 2 inverse waves it will sound different than 2 of the same wave.

I'm also working on a gameboy inspired music app! (for Android, in Rust)

If you use a perfect square wave, the aliasing is extremely audible and sounds terrible.

As far as I can tell, in native everyone uses a nifty algorithm called band-limited audio synthesis, and specifically blargg's implementation blip_buf.

https://slack.net/~ant/bl-synth/

If I were OP I'd try to compile the rust port of this library to WASM.


That's sweet! I'm trying to see how far I can push the web audio api but yeah the aliasing can be harsh. I think there are also some things I can do with filter nodes to smooth things out a bit

The aliased harmonics are folded back and mixed with the true harmonic content of the square wave. You can't filter them out without significantly affecting the square wave sounds.
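To see where the folded harmonics end up, here is a small Python sketch. The 1760 Hz fundamental and 48 kHz sample rate are arbitrary picks for illustration, and the function name is made up.

```python
def folded_frequency(freq_hz, sample_rate):
    """Frequency in 0..Nyquist where a component at freq_hz lands after sampling."""
    f = freq_hz % sample_rate
    return sample_rate - f if f > sample_rate / 2 else f


fundamental = 1760
for n in (1, 3, 15, 27, 29):  # a few odd harmonics of a 50% square wave
    print(n, n * fundamental, folded_frequency(n * fundamental, 48_000))
# Harmonic 15 (26400 Hz) lands at 21600 Hz, harmonic 27 (47520 Hz) at 480 Hz,
# and harmonic 29 (51040 Hz) at 3040 Hz.
```

480 Hz sits below the fundamental and 3040 Hz sits between the 1st and 3rd harmonics, so a filter that removed them would also have to cut into the part of the spectrum the square wave needs.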

There is one thing that I don't understand here: wouldn't implementing a custom square wave generator as an AudioWorkletProcessor be a more straightforward approach in this case?

I think long term I will be moving in that direction. For this initial exploration and experimentation I've been doing with the Web Audio API, the sawtooth oscillator + step function waveshaper node have been sufficient for my use case, but I will need to investigate whether I can produce a more authentic sound with an AudioWorkletProcessor approach
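For anyone who hasn't read the article, a rough Python sketch of the sawtooth + step-function idea. This is not the article's code: the table length, the names, and the nearest-entry lookup are assumptions, and as far as I know the real WaveShaperNode interpolates linearly between curve entries.

```python
def step_curve(duty, length=256):
    """Lookup table over inputs -1..1: +1 for the top `duty` fraction, else -1."""
    threshold = 1.0 - 2.0 * duty  # sawtooth values above this map to +1
    curve = []
    for i in range(length):
        x = -1.0 + 2.0 * i / (length - 1)  # input value this entry stands for
        curve.append(1.0 if x > threshold else -1.0)
    return curve


def shape(sample, curve):
    """Map one input sample in -1..1 through the table (nearest entry)."""
    index = round((sample + 1.0) / 2.0 * (len(curve) - 1))
    return curve[index]


curve = step_curve(0.25)
# One cycle of a rising sawtooth, sampled at 1000 points
saw = [-1.0 + 2.0 * (i + 0.5) / 1000 for i in range(1000)]
square = [shape(s, curve) for s in saw]
high_fraction = square.count(1.0) / len(square)  # close to 0.25
```

Because the sawtooth ramps linearly, the fraction of each cycle it spends above the threshold is the duty cycle, so changing the curve changes the pulse width without touching the oscillator.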

I did a little synth project recently that uses an AudioWorklet processor to morph between single-cycle waveforms, and it worked super well. When I tried to do this with the Web Audio API, the audio would stutter when I moved the controls. Switching to an AudioWorklet thread eliminated the stuttering issue. So, if you need real-time sound shaping controls, you may find that AudioWorklet is a better fit.

https://waves.tashian.com


I’ve got a very simple project demonstrating how to use the newish pure C++ Emscripten Audio Worklet API if you’re interested. It’s a bit neater than the old way you’ll usually come across which also involves writing JS code, but there aren’t many docs online! https://github.com/tomduncalf/emscripten-audio-worklet-examp...

Yes... IMO, the WebAudio synthesis primitives are not particularly useful. You are so much better off discarding these primitives and rolling your own for an audio project of any significance.

I agree. There’s a bit of mental overhead related to working with audio worklets (you have to load them from a URL or a blob URL, etc.) but for square waves in particular the logic should be fairly straightforward, the process function just needs to output 1 or -1 for any given sample.
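The per-sample logic would be something like the sketch below, written in Python as a stand-in for the body of a worklet's process function. The function name and the phase-accumulator approach are my assumptions; the 128-sample block matches the Web Audio render quantum. This is the naive version, so it will alias.

```python
def naive_pulse_block(phase, freq_hz, duty, sample_rate, block_size=128):
    """Fill one block with +1/-1 samples.

    Returns (samples, phase), where phase is carried into the next block so
    the waveform continues without a glitch at block boundaries.
    """
    step = freq_hz / sample_rate  # fraction of a cycle advanced per sample
    out = []
    for _ in range(block_size):
        out.append(1.0 if phase < duty else -1.0)
        phase = (phase + step) % 1.0
    return out, phase


# 375 Hz at 48 kHz is exactly one cycle per 128-sample block
samples, next_phase = naive_pulse_block(0.0, 375, 0.25, 48_000)
```

Changing the duty cycle is then just a matter of passing a different `duty` on the next block.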

It needs to do a lot more than that if you want to have something that doesn't alias like crazy. And significantly more if you want it to sound like a game boy.

Well, the output of the wave shaper approach in the article is exactly the same sharp digital square wave that you'd get from a trivial square wave generator.

But I overlooked the point that the GP mentions that the processor source code must be loaded from a separate JS file. That's some quite annoying overhead.


What I was trying to imply was that it's not enough to build a square wave generator to sound like a game boy. Even if that generator in the game boy was a perfect square wave generator, which I'm not sure it is.

The sound created by that generator passes through a fairly complicated filter known as the game boy's speaker. To properly create a game boy sound, you need to find or take an impulse response of that speaker and convolve it with the output of your oscillator.
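For reference, the convolution step itself is tiny. A minimal Python sketch follows; the impulse response values are invented placeholders, not measurements of any real speaker, and in Web Audio a ConvolverNode would do this job on a recorded response.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: each input sample launches a scaled copy of the response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical 3-tap "speaker" response, for illustration only
fake_speaker_ir = [0.5, 0.25, 0.25]
square = [1.0, 1.0, -1.0, -1.0] * 2
shaped = convolve(square, fake_speaker_ir)
```

A real response would be thousands of samples long, at which point an FFT-based convolution is the usual choice over this double loop.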


Well, and a single recorded impulse response is probably not enough, either, if you want to be super accurate. The directivity pattern is probably awful and shaping the sound as well.

I know all of that. But I just didn't want to get into these details in this thread.


Shot of the output in time and frequency domains going from samples 1 through 3 from left to right. Second one is funny.

https://imgur.com/a/PY942sy

(Chrome on mac)


Cool post! I am also making an 8-bit-like tracker with web audio (and my name is also Dan B. Weird), but I never found this sort of solution to make it work with the native nodes. I just wound up manually creating wav buffers. Gives you the ability to emulate a lower sample rate and lower pitch resolution (8 bits on NES, I believe), plus you can work around the weird scheduling quirks of the WebAudio api that didn’t really jibe with my real time game audio use case. It’s live at deathbit.okay.tools if you wanted to check it out.

Great name. That's super cool, I like the retro aesthetic. The scheduling piece of trackers built with the Web Audio API is definitely a bit of a struggle but so far it's been manageable

I also demonstrate square waves synthesized with the Fourier approach, along with a duty cycle parameter: https://www.nayuki.io/page/band-limited-square-waves

Web Audio is awesome and IMHO people who got the spec out and pushed for implementation deserve lots of respect. You want to make a tracker that runs in your browser and you have all the primitives at your disposal to make it work? A few years ago I wouldn’t have imagined it could be possible.

Looking forward to seeing the tracker some time!


I have a hotter take, that the spec and implementations were rushed and it makes it hard to push past "toy" quality for web apps. I have seen people struggle and wind up moving mountains to get what is basic functionality and performance in native audio apps.

The other thing is that they have made cardinal sins like relying on direct form biquads for basic filters and using an array for parameters. It's good enough to make a demo but falls apart in the situations that you actually care about, and these are things that the pro audio industry (or similar, like gaming) have had solved for a very long time (*)

* pro audio software is a disaster, but for other reasons


It's super cool! Just from the short time that I've been playing around with it, it looks like you can push it pretty far. Before too long I hope to have a Show HN post with a link to the tracker out live in production

I was curious to see how audible square wave duty cycle would be (I figured not much) but sadly the audio examples are clearly broken on this page--choosing the same one multiple times gives totally different sounds.

Changing the duty cycle of a square wave is called Pulse Width Modulation and is an extremely audible and iconic sound. If you have ever heard any music on the Commodore 64, you will be familiar with its sound. PWM is also available in many professional synthesizers of the same era.

Yes! My dad has an old Commodore 64 and I vaguely remember a skiing game. I think it's so cool what folks were able to accomplish musically on such limited systems at the time

Thank you for calling that out, I am aware that the audio demos are a little busted right now and I'll be patching them up when I get a little time

That waveshaper version is going to alias pretty badly, isn't it? Did the original Gameboy sounds have that aliasing?

I'm not an expert on GB chiptune, but what I've heard from enthusiasts is that different GB models sound different, and even within the same model there are variations. That said, it wouldn't surprise me if the GB waveforms were aliasing, at least from the digital side, given that it was operating with pretty minimal synthesis capabilities. There's probably some extra low- and high-pass filtering that shape the sound after the waveform itself is generated. Looking at some examples of actual GB PWM waveforms, for sure some high-pass would make a pure PWM waveform more GB-like. And some low-pass could help a bit with aliasing.

it does, you definitely "feel" the change. I actually have to do some digging to figure that out, I unfortunately do not have my childhood GBA anymore so I have to rely on audio clips to make that call

Don't forget to factor in the Gameboy speaker in your listening of those audio clips. That's a major factor that will change how these waveforms sound very significantly. Those clips, and the classic sounds of chiptunes, are never "just" the sound coming off the DAC.

"Variable duty cycle?"

So just PWM?

Also, if you needed to exceed 50%, you could have just combined two different square waves, out of phase, and you'd be there.


> if you needed to exceed 50%, you could have just combined two different square waves

I don't think this is possible. A balanced square wave has no even harmonics in the frequency domain. Anytime the duty cycle is not 0%, 50%, or 100%, you will have non-zero even harmonics.

A linear combination (scaled sum or difference) of two balanced square waves will necessarily still have all even harmonics at zero, and thus cannot emulate a square wave with a duty cycle different from 50%.
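A quick numerical check of the linear-mix claim, as a Python sketch. The quarter-cycle offset and the 0.5 gain are arbitrary picks, and the helper is a plain DFT bin, nothing Web Audio specific.

```python
import cmath


def harmonic_magnitude(cycle, n):
    """Magnitude of the n-th harmonic of one sampled cycle (plain DFT bin)."""
    total = sum(x * cmath.exp(-2j * cmath.pi * n * i / len(cycle))
                for i, x in enumerate(cycle))
    return abs(total) / len(cycle)


N = 64
shift = N // 4
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]
shifted = square[-shift:] + square[:-shift]              # same wave, a quarter cycle later
mixed = [a + 0.5 * b for a, b in zip(square, shifted)]   # scaled sum of the two
pulse25 = [1.0 if i < N // 4 else -1.0 for i in range(N)]

for n in (2, 6):
    print(n, harmonic_magnitude(mixed, n), harmonic_magnitude(pulse25, n))
```

The even harmonics of the mix stay at zero (to rounding error) while the 25% pulse has clearly non-zero 2nd and 6th harmonics. The mix is also a four-level staircase instead of a two-level pulse. This only covers linear mixing; combining the two waves with a logic operation such as AND or XOR is a different story.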


> Anytime the duty cycle is not 0%, 50%, or 100%, you will have non-zero even harmonics.

The article is specifically trying to achieve those harmonics. This is the initial problem.

> A linear combination (scaled sum or difference) of two balanced square waves will necessarily still have all even harmonics at zero

Yea, that's why you need the phase offset, as I mentioned.

> and thus cannot emulate a square wave with a duty cycle different from 50%.

You can /synthesize/ a square wave at any duty cycle you like. We're still doing Fourier just without the whole transform.


That's true, I'm coming into the audio and signals world as a beginner so I'm still picking up the lingo but totally! For my purposes I only want to keep track of a single source for a channel so I've gone down this wave transformation path rather than composition but that also would get the job done

Yeah, at least in the context of music synthesis, everyone just says PWM. (The term is even the subject of a meme with a certain youtuber who seems to be fond of it ;)).


