I wouldn’t discount ML. Nonlinearities are the bread and butter of modern ML models. In fact, two linear layers without a nonlinearity in between are equivalent to one big linear layer, so nonlinearities are required.
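To make that concrete, here's a minimal NumPy sketch (the shapes and names are just illustrative): stacking two linear layers collapses into a single matrix, and only a nonlinearity between them breaks that equivalence.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two "layers" as plain matrices (biases omitted for brevity).
    W1 = rng.normal(size=(64, 32))   # first linear layer: 32 -> 64
    W2 = rng.normal(size=(16, 64))   # second linear layer: 64 -> 16
    x = rng.normal(size=32)

    # Without a nonlinearity, stacking layers is just one matrix product.
    two_layers = W2 @ (W1 @ x)
    one_layer = (W2 @ W1) @ x
    assert np.allclose(two_layers, one_layer)   # no extra expressive power

    # Inserting e.g. a ReLU between them breaks the equivalence.
    relu = lambda z: np.maximum(z, 0.0)
    assert not np.allclose(W2 @ relu(W1 @ x), one_layer)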
To put it another way, I would gladly bet any reasonable sum of money that in a double-blind test, a listener wouldn’t be able to tell the difference from a genuine guitar pedal. (Not necessarily this pedal, but I suspect ML will model the effects more than adequately for the precision of human hearing.)
FWIW, I say this as someone who used to argue that graphics programmers were doing gamedev all wrong because they weren’t modeling light; they were approximating it. ML models were the way out.
I also think much of the problem is that ML devs often don’t have traditional signal processing experience, so they haven’t been modeling signals in quite the right way. (I’m trying to rectify that a bit with my FFT tutorials: https://twitter.com/theshawwn/status/1398796224921321472?s=2...) It remains to be seen, but Fourier space has recently been making strides in ML, and it’s likely much easier for a model to approximate a nonlinear waveform in frequency space than as a raw waveform.
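As a rough illustration of why frequency space might help (a toy sketch, with tanh standing in for a pedal's clipping stage, not anyone's actual pedal model): a distorted tone that looks messy as a raw waveform is extremely structured in the FFT, with nearly all of its energy on a handful of harmonics.

    import numpy as np

    fs = 48_000                      # sample rate (Hz)
    t = np.arange(fs) / fs           # one second of audio
    x = np.sin(2 * np.pi * 440 * t)  # clean 440 Hz tone

    # tanh clipping as a stand-in for a distortion pedal's nonlinearity.
    y = np.tanh(5.0 * x)

    # In frequency space the distorted tone is still very structured:
    # almost all energy sits on a few odd harmonics of 440 Hz.
    Y = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1 / fs)
    top = freqs[np.argsort(Y)[-5:]]
    print(sorted(top))               # roughly [440, 1320, 2200, 3080, 3960] Hz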
To put it another way, if human speech is getting to the point where ML models can trick people, what are the chances that a future model won’t be able to do it for guitars?
I've heard images are better modeled in DCT space (which isn't based on complex numbers) because it's better at energy compaction than the FFT, and because it doesn't assume the image is periodic. Some people also think the FFT is insufficient, even for audio, because it doesn't model time-domain hearing perception. Others say wavelets are better at modeling images than purely frequency-domain transforms because they take spatiality more into account. From what I've heard, wavelets work well for modeling human vision (in fact, convolutional neural network input kernels tend to converge to Gabor filters, though I don't know how they differ from Gabor wavelets) and for noise reduction, but have fallen flat for image/video compression codec design.
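For what it's worth, the energy-compaction point is easy to see on a toy example (a sketch, not a careful benchmark): for a smooth but non-periodic signal like a ramp, the DFT's implied periodic extension has a jump that smears energy across many bins, while the DCT's even extension doesn't.

    import numpy as np
    from scipy.fft import dct, fft

    # A smooth but non-periodic 1-D signal: a ramp (think one row of an image).
    x = np.linspace(0.0, 1.0, 256)

    def missed_energy(coeffs, k=8):
        # Fraction of signal energy NOT captured by the k largest coefficients.
        e = np.sort(np.abs(coeffs) ** 2)[::-1]
        return e[k:].sum() / e.sum()

    # The DFT treats x as periodic, so the wrap-around jump costs many coefficients;
    # the DCT's symmetric extension is continuous and compacts energy much better.
    print("FFT misses:", missed_energy(fft(x)))                 # roughly a few percent
    print("DCT misses:", missed_energy(dct(x, norm="ortho")))   # orders of magnitude smaller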
All excellent points, and I think you should DM me on twitter to chat about this more. (I hope you will!)
DCT is on my radar. But there are several serious limitations that I think are overlooked. For example, convolution is no longer a simple component-wise multiplication. That seems like a big deal to me.
In other words, you're probably right, but I'm focused solely on FFTs on the (very low) chance that people have overlooked something that will work well.
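On the convolution point: with the DFT, circular convolution turns into a pointwise product of spectra, but naively doing the same thing with DCT coefficients does not give back the convolution (the DCT has its own symmetric-convolution rules instead). A small sketch of just that:

    import numpy as np
    from scipy.fft import fft, ifft, dct, idct

    rng = np.random.default_rng(0)
    N = 64
    x = rng.normal(size=N)
    h = rng.normal(size=N)

    # Direct circular convolution, for reference.
    direct = np.array([sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)])

    # DFT: pointwise multiplication of spectra IS circular convolution.
    via_fft = np.real(ifft(fft(x) * fft(h)))
    assert np.allclose(via_fft, direct)

    # DCT: pointwise multiplication of coefficients is NOT the same convolution.
    via_dct = idct(dct(x, norm="ortho") * dct(h, norm="ortho"), norm="ortho")
    assert not np.allclose(via_dct, direct)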
Sorry, I don't work on neural networks much, and my plate is too full with other projects (and my DSP is a bit rusty) to hold a conversation on this right now. And I don't use Twitter much either.
> The nonlinearities are the bread and butter of modern ML models
I guess I didn't get my point across. What I meant was that pedal settings tend to be nonlinear, with multiple sweet spots (which often depend on the guitar and amp), so you shouldn't just sweep each knob linearly from 1 to 10 to get your training data, giving 10^N combinations (where N is the number of knobs), as someone else had suggested. Moreover, there are also dependencies on the impedance chain, gain structure, feedback, reflections, etc., which seem well suited to circuit and physical modeling. Digital pedals, as I note, are largely software anyway, so it doesn't make sense to me to try to model them with ML any more than it does to model Microsoft Word using ML (though I'm sure someone has tried).
In general ML seems most useful when you don't have good analytical models - but in the case of circuits and software we have very good analytical models.
That's fair, and true! But one interesting thing about ML models is that they're often much more performant. For example, it's relatively expensive to evaluate analog circuits digitally. An ML model that can do it on a Raspberry Pi with no delay and no quality loss is interesting, to me at least.
Actually, the researchers at Aalto University (now at Neural DSP), who pioneered the guitar ML technology, were initially working on speech and did this one as a side project.