For analog pedals, I'd expect better results from circuit modeling.
Given the wide (and very non-linear) range of settings of a typical pedal, as well as interaction (impedance, etc.) with a real guitar and amplifier, it seems like it would be a pain to get all of the training data.
For a digital pedal, running the actual (e.g. Eventide) DSP code is just going to be better than some ML approximation.
On the other hand, I've been a bit dissatisfied with amplifier and cabinet models based on traditional DSP and physical modeling approaches, so maybe neural networks could fill in some of the gaps.
I wouldn’t discount ML. The nonlinearities are the bread and butter of modern ML models. In fact, two linear layers without a nonlinearity in between are equivalent to one big linear layer, so nonlinearities are required.
To put it another way, I would gladly bet any reasonable sum of money that in a double blind test, the listener wouldn’t be able to tell the difference from a genuine guitar pedal. (Not necessarily this pedal, but I suspect ML will model the effects more than adequately for human hearing precision.)
FWIW, I say this as someone who used to argue that graphics programmers were doing gamedev all wrong because they weren’t modeling light, they were approximating light. ML models were the way out.
I also think much of the problem is that ML devs often don’t have traditional signal processing experience, so they haven’t been modeling signals in quite the right way. (I’m trying to rectify that a bit with my FFT tutorials: https://twitter.com/theshawwn/status/1398796224921321472?s=2...) It remains to be seen, but Fourier space has recently been making strides in ML, and it’s likely much easier for a model to approximate a nonlinear waveform in frequency space than as a raw waveform.
To put it another way, if human speech is getting to the point where ML models can trick people, what are the chances that a future model won’t be able to do it for guitars?
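To make the nonlinearity point above concrete, here's a tiny NumPy sketch (purely a toy example of my own, nothing to do with the pedal project) showing that two stacked linear layers with no nonlinearity in between collapse into a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)   # "layer 1"
W2, b2 = rng.standard_normal((3, 8)), rng.standard_normal(3)   # "layer 2"
x = rng.standard_normal(4)

two_layers = W2 @ (W1 @ x + b1) + b2          # no nonlinearity in between

W, b = W2 @ W1, W2 @ b1 + b2                  # fold both into one affine map
one_layer = W @ x + b

assert np.allclose(two_layers, one_layer)     # identical: no extra expressive power
```

That's why every useful network interleaves nonlinearities, and why heavily nonlinear pedal circuits are exactly the kind of thing these models are built to fit.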
I've heard images are better modeled in DCT space (which isn't based on complex numbers) because it's better at energy compaction than the FFT, and because it doesn't assume the image is periodic. Some people also think the FFT is insufficient, even for audio, because it doesn't model time-domain hearing perception. Others say wavelets are better at modeling images than purely frequency-domain transforms, because they take spatiality more into account. From what I've heard, wavelets work well for modeling human vision (in fact, convolutional neural network input kernels tend to converge to Gabor filters, though I don't know how they differ from Gabor wavelets) and for noise reduction, but have fallen flat for image/video compression codec design.
All excellent points, and I think you should DM me on twitter to chat about this more. (I hope you will!)
DCT is on my radar. But there are several serious limitations that I think are overlooked. For example, convolution is no longer a simple component-wise multiplication. This seems, to me, a big deal.
In other words, you're probably right, but I'm focused solely on FFTs on the (very low) chance that people have overlooked something that will work well.
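To spell out the convolution point with a concrete check: under the FFT, circular convolution becomes a plain component-wise product of coefficients, and that's the property I'd hate to give up. A quick NumPy sketch (my own illustration, not tied to any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
x = rng.standard_normal(N)   # signal
h = rng.standard_normal(N)   # kernel, same length (i.e. already zero-padded)

# Convolution theorem: circular convolution == component-wise product in FFT space
via_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

# Direct circular convolution for comparison
direct = np.array([sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)])

assert np.allclose(via_fft, direct)
```

With the DCT there's no equally simple identity (you get symmetric-convolution variants instead), which is the limitation I meant.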
Sorry, I don't work on neural networks much, and my plate is too full with other projects (and my DSP is a bit rusty) to hold a conversation on this right now. And I don't use Twitter much either.
> The nonlinearities are the bread and butter of modern ML models
I guess I didn't get my point across. What I meant was that pedal settings tend to be non-linear with multiple sweet spots (which often depend on the guitar and amp), so you shouldn't just sweep each knob linearly and take all 10^N combinations (where N is the number of knobs) as training data, as someone else had suggested. Moreover, there are also dependencies on the impedance chain, gain structure, feedback, reflections, etc., which seem well-suited to circuit and physical modeling. Digital pedals, as I note, are largely software anyway, so it doesn't make sense to me to try to model them with ML any more than it does to model Microsoft Word with ML (though I'm sure someone has tried).
In general ML seems most useful when you don't have good analytical models - but in the case of circuits and software we have very good analytical models.
That's fair, and true! But one interesting thing about ML models is that they're often much more performant. For example, it's relatively expensive to evaluate analog circuits digitally. An ML model that can do it on a raspi with no delay and no quality loss is interesting, to me at least.
Actually, the researchers at Aalto University (now at Neural DSP) who pioneered the guitar ML technology were working on speech initially and did this one as a side project.
> On the other hand, I've been a bit dissatisfied with amplifier and cabinet models based on traditional DSP and physical modeling approaches,
Have you tried the Fractal stuff? I've been using it since the first generation (consistently for live use since the Axe-Fx II days) and they've been ahead of the competition since their inception. At this point I'd venture to say that the majority of their amp models sound indistinguishable from the real thing with no advanced parameter tweaking.
That being said when I was between Fractal units about a year ago I spent a brief amount of time with the Pod Go and was immensely impressed with how much they were able to pack into a $500 floor unit. Most of the amps and drives still felt a bit caricature-y but were still very usable - a far cry from the Pod Bean days. It's truly a great time to be a musician.
It is my understanding that capture based emulation of a given sound seems like a better idea than it really is. Say you capture a given amp or pedal sound - even if you get the latency down to acceptable #s, what if you want/need to turn a knob? Tweaking knobs is an essential part of the process of dialing in a sound relative to your guitar and output environment.
Contrast with the Fractalaudio approach of modeling each component of a device. Fractal's AxeFx is the gold standard and any geek would gush over the HW and SW engineering. The best part is that the company owner keeps improving his algos and pushing out updates for free. This device costs the equivalent of a good amp head and is loaded with more amps and effects than any of the competition.
Sorry if this sounded like an ad, but I am always surprised how little airtime this amazing product gets in hacker circles.
The AxeFX is old news when it comes to guitar modeling. The newest Kemper firmwares do a much better job, and NeuralDSP's Quad Cortex does an even better job still along with the ability to capture distortion pedals.
Modeling real gear is all fine and dandy, but what really has potential is being able to replicate circuits that aren't viable in the real world, like a tube amplifier on the edge of oscillation or one run way over the rated TDP of its tubes. Situations that might provide sonic possibilities but aren't viable long term in hardware can now be stable and repeatable. Additionally, being able to modify a tone stack in software to provide extra flexibility is very handy.
The Quad Cortex is also slated to receive updates that will allow it to capture modulation effects, and will undoubtedly get better with time as the Kempers did.
>The newest Kemper firmwares do a much better job, and NeuralDSP's Quad Cortex does an even better job still
Umm. You and I are just random people on the internet. I have spent a lot of time trawling the places where pro musicians talk about the leading devices, and I have found that Fractal is very widely considered to be the best of the best as far as sound quality goes. Kemper is known to be easier to use, and Neural is the new kid on the block with Bluetooth, an on-device touch screen, and footswitches that act as knobs. For the limited number of tones you get, it is supposed to sound great. None of those features are advantageous to me at all.
>being able to replicate circuits that aren't viable in the real world
That is definitely something that Fractal does. Considering the two devices you are talking up are based on capturing actual tones from real world devices, I am unsure of your point here.
>capture modulation effects
Modulation and time-based effects have been modeled to perfection in the digital realm for a long time now. See the ubiquity of Strymon, etc. Fractal has equally good algorithms and allows incredibly intricate signal paths, as it is an all-in-one device. I have four expression pedals and 10 switches that can be programmed to control any parameter I would desire with a couple clicks of a mouse. No other multi-effects device brings this degree of controllable complexity.
The knobs can be included in the emulation. So that is to say, you model the unit as something having an audio input, and several controllers, such as knobs or switches. You then capture the behavior for all combinations of controller values, as part of the model. ("All combinations", for a potentiometer, might mean stepping it from 0 to 11 in increments of 1.)
The emulation based on the model runs on a piece of hardware which has some generic controllers on it: some rotary encoders, a couple of switches and whatnot. These get assigned to the parameters of the model.
You could have a MIDI input on it, and use MIDI controllers, which would be cool. There are MIDI foot controllers that you can tilt with your foot to vary a parameter.
The different models could be assigned to MIDI program numbers. You could change the patch number with the foot controller, and vary the parameters with it also.
The foot controller might have, say, only two pedals, so you have to assign which ones you want: if the patch has five parameters, you have to fix the values of three of them and map the two important ones to the foot pedals you have. For the others, you can bend down and tweak the knobs on the unit itself.
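To make the "all combinations" idea concrete, the capture plan is basically just the Cartesian product of the stepped controller values. A toy sketch (the knob names and step counts are made up for illustration):

```python
from itertools import product

# Hypothetical controls for one pedal: three knobs stepped 0..11, plus one switch
controls = {
    "gain":   range(0, 12),
    "tone":   range(0, 12),
    "level":  range(0, 12),
    "bright": (0, 1),
}

names = list(controls)
capture_plan = [dict(zip(names, combo)) for combo in product(*controls.values())]

print(len(capture_plan))   # 12 * 12 * 12 * 2 = 3456 captures
print(capture_plan[0])     # {'gain': 0, 'tone': 0, 'level': 0, 'bright': 0}

# Each entry would be paired with a recorded (dry input, wet output) clip, and the
# controller values fed to the model as extra conditioning inputs alongside the audio.
```

It grows fast, which is why you'd usually step coarsely and let the model interpolate between captured settings.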
The problem with this approach is that you need knowledge of the circuit to make it happen; not only that, you also need people who can reconstruct the circuit digitally.
This approach is not scalable (that's why the high cost).
The ML approach doesn't require any knowledge of the system (black-box) to produce the result.
> this approach is not scalable (that's why the high cost)
It may be the opposite: most amps follow some classic schematic (e.g. JTM, Plexi, Princeton) with insignificant changes, so after building digital copies of a limited number of classic amps they can add new ones rather fast.
As a result, Fractal has about 100 high-quality models already (average guitarist probably uses 5?).
> the ML approach doesn't require any knowledge
The ML approach requires you to capture training data: different sound samples with all possible knob positions (8 knobs per amp on average), plus different types of speakers and mics, which is a very large number of variations.
> that's why the high cost
Cost is driven by the market; the ML-profiling competitors (Kemper, NeuralDSP) charge about the same for their devices.
> sure, but can it model my amp? i doubt they will ever add it!
This is a very different and more narrow use case.
Also, they have had Tone Match like forever: you build a signal chain close to your amp, then add a Tone Match block, which applies ML to voice your digital signal chain close to the recording.
Also, in this example he didn't profile an actual amp but an actual AC/DC recording, and the result is very good, I think.
> how many presets do you really need (average guitarist probably uses 5?)
But how do you find this preset for your signal chain (guitar + speakers)? That's one of the big points of frustration with the Kemper: you need to go through hundreds of profiles (not necessarily good quality) to find one that will sound good with your signal chain.
With Fractal, you take some basic preset and change the knobs to your taste and goal, as with a real amp.
This is exactly what Fractal has achieved. They took actual amps and modeled the actual circuits. Basically, this work was done for a massive number of amps by one dude. And he sells this device for $2k - which is ~ the cost of a single good tube amp head.
Just a quick note, you could solder together most classic guitar pedals for $20. You could buy reissues of many of them for less than $100. Big Muff, Proco Rat, Klon Centaur (Klones), Tubescreamer, Dynacomp, etc are all available for less than that. Just being clear that this is not a price issue.
This is fine if you like that sort of thing. I would note that latency is very important here: it's not going to be nice to play through if it's incurring any significant latency.
As the creator of this project I can assure you that the audio used here is at least CD quality (44.1kHz 16bit). With the HiFiBerry hat the digital audio comes in at 24bit/192kHz. The NeuralPi DSP processes the audio at 44.1kHz with 32 bit floating point precision. No reason the sample rate can’t be higher though. Elk OS claims latency is less than 1ms, but I’d like to test and see exactly what the latency is running the plugin. As a guitarist, I can’t tell the difference between this and an analog effect.
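For a rough sense of what buffer sizes mean at that rate (back-of-the-envelope arithmetic only, not measured NeuralPi figures):

```python
# Per-buffer delay at 44.1 kHz; a round trip adds input + output buffers plus converter latency
sample_rate = 44100
for buffer_size in (16, 32, 64, 128):
    ms = 1000 * buffer_size / sample_rate
    print(f"{buffer_size:>4} samples ≈ {ms:.2f} ms")
# 16 ≈ 0.36 ms, 32 ≈ 0.73 ms, 64 ≈ 1.45 ms, 128 ≈ 2.90 ms
```

So a sub-millisecond round-trip claim implies very small buffers, which is exactly what I want to verify on the device.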
Not necessarily... why? We have been very successful with amp / pedal modeling through regular DSP methods. You'd be very hard pressed to find an album that doesn't use one nowadays. What makes NN methods fundamentally different?
>it's not going to be nice to play through if it's incurring any significant latency.
Yes, but this is a non-issue for many of the current systems, and has been for a couple of decades now. 1-2ms latency is pretty achievable, especially with an RT kernel. That is roughly the natural latency of a sound source about 1 meter away from you.
Yes, latency is huge. Modern digital audio stacks in consumer OSes are still completely terrible at this. Not that it's an easy problem to solve.
But it's pretty hard to beat electrons flowing through an analog circuit, when on the digital side you have to: convert analog to digital, run through the kernel to get to user space, run the bits through the RNN, send back to kernel space, convert to analog, and finally send to an output.
In order to be competitive with a 1980s guitar pedal, you have to do all of that in under ~10ms latency, and that's just really hard still.
Even though we carry around these supercomputers in our pockets these days, there are still some things left where analog still beats the pants off digital.
Yes, latency is really important, though it is important to note that digital signal processing doesn't need that huge stack. In fact, audio DSP chips have at least 25 years of history and the result is that today you can get things like the Vox AC30 which is a digital headphone amp--all digital--that has no perceptible latency. That one in particular, sounds pretty darn good, for just being a single battery-powered chip!
> In order to be competitive with a 1980s guitar pedal, you have to do all of that in under ~10ms latency, and that's just really hard still.
Well, don't use a whole PC with software stack. A custom embedded solution with DSP can easily manage.
> Even though we carry around these supercomputers in our pockets these days, there are still some things left where analog still beats the pants off digital.
Again, it's not digital vs analog, it's massive software stack versus embedded hardware/software solution.
Also, we've had electronic simulators for various instruments for ages now.
Things like Clavinova digital pianos or Line 6 PODs simulating guitar/bass amps and effects have been out there for decades now.
And while they have been quite popular due to the sheer number of sonorities and the convenience they bring (the possibility of playing with a headset, extremely useful for playing at night or in apartments), traditional analog setups remain strong.
Playing on an analog setup is still more pleasant and more expressive IMHO; in particular, "simulators" tend to mask the attack when hitting a note and hide a lot of the tension in the hands/fingers when playing, leading to potential bad habits, especially for people learning to play an instrument. Analog-to-digital and digital-to-analog conversion definitely leads to losses in expressiveness.
I'd say lower guitar skill ranges (me) get an improved sound from modern digital effects and tools. As the skill gets higher, those effects, especially high compression, mask your style and desired end sound.
Traditional methods were successful in emulating sounds, but they fall short of replicating the feel and response of the hardware they're trying to replicate.
> Modern digital audio stacks in consumer OSes are still completely terrible at this. Not that it's an easy problem to solve.
Honestly, the only OS that's truly bad at this is Windows. DirectSound is laggy and highly limited in its capabilities, and even a nice ASIO won't fully alleviate your issues. Your best bet is to get a DAC and hope for the best. Besides that, I've found Linux and macOS to be very similar in terms of latency out of the box. However, I've found that tuning Linux with a custom low-latency kernel absolutely destroys CoreAudio's latency. Given that it's something most people won't be doing, I think it's fair to say that both OSes are tied, but I still give the edge to Linux for having a more modular and adaptable sound backend.
How huge is the latency? It's using Elk Audio OS, claiming 1ms roundtrip. I doubt this usage gets that low but I wouldn't be surprised if it was doing alright
The time in the RNN might be the only thing that's hard. We can commonly do A/D, D/A pretty quick. I can make a db request over a network, have it parse the SQL, execute it reading a bunch of SSD pages and return the sorted results in about 1ms.
The answer would probably be to reduce the 'learned' output to be a convolution kernel that gets run rather than the RNN itself on the input. Then the kernel only has to change gradually to produce a different sound not continuous processing to produce a particular sound.
I could see using something like this on a pedalboard in pedal form if you had the latency around <5ms or so for sure. I use a combination of analog and digital effects combined with analog amplifiers. I still prefer the actual amps quite a bit compared to simulations, but the simulations are a lot less of a pain. Recently I have been using a combination load box / cabinet simulator and I think those types of devices can really deliver in terms of tone and convenience.
> Modern digital audio stacks in consumer OSes are still completely terrible at this
This has been solved for over a decade. Linux is a bit tricky, but macOS and Windows have native low-latency drivers. There are also a ton of digital effects units on the market running at effectively zero latency.
Also, the competition is not against vintage analog gear. Modern analog gear is having a true Renaissance, and this community can afford to support the Wampler's of the world. I don't understand the modelling camp at all, their stuff just doesn't sound good, nor is there any joy in working with it.
I feel the same but it's hell convincing the reductionists that they haven't got 'the thing, and the whole of the thing' in their little emulation.
There's also another element: if you have, say, a vintage Fender Champ and a Klon (or whatever) it's because you mean to project different expressions through your string handling and note-playing. At that level you've made a best effort to produce the most emotionally transparent and responsive signal chain, which you will then not think about once you've got it turned on and tweaked: ALL the settings are liable to sound 'good' and respond for you.
The modeling approach is so often "This is exactly that, but better, because here are twelve other Fender Champs and models of Klon to choose from!" and when the first claim isn't as true as we would like, and the second is a distraction and time-sink, that's not great.
I can tell when I've chosen wrongly in my music-making tools, because I flat-out stop making music. Even in a dilettantish way: it just stops being a thing. That's a concern.
I have this problem just with the torpedo captor x speaker simulator. At first I thought it was a game changer for apartment guitar playing. However the plethora of speaker, mic, and placement options makes me spend way too much time browsing and fussing rather than playing. I’d rather have one decent cab in a room where I can turn up the volume rather than hundreds of simulation options.
FWIW, I have been using the Captor X as a quick tracking and editing tool. I will record the DI + Captor track. It can be a bit easier to comp the DI parts. Then later I go back and re-amp them thru my amps and speakers. I do have the benefit of a few nice amp and cabinet options and decent soundproofing, but it helps keep the ear-bleeding levels down to a minimum. That has kind of helped me avoid fiddling with Captor settings endlessly.
In all fairness, it's running on a Raspberry Pi. There are plenty of ways you could improve latency and audio quality just by scaling the hardware to fit your needs. Plus, it's not like someone's going to flip their shit when they realize that one of the electric guitars was recorded at 44.1kHz instead of 48/96/192kHz.
There is no reason why this would run at “MP3 quality”, given that it would be a really bad idea to compress the audio data before running it through an algorithm. I would expect it’s at minimum CD quality, and perhaps better, depending on the fidelity of the A/D and D/A stages and the bandwidth of the algorithm.
No, processing deteriorates the sound from what you had in raw capture form. I stand by the assertion. MP3 quality at best, previous-generations-of-modeler more likely. It's not going to be Kemper grade running on a RPI, and that's still MP3 quality, just not 'super low bit rate MP3 quality'.
MP3 @ 320 kbps is fine. Latency is a different thing. Also it seems like you're dismissing this without knowing anything about the actual latency (or sound quality).
That's not necessary. And an FPGA doesn't have lower latency (I mean on the ms scale) compared to a Pi because of the hardware. It has lower latency because it doesn't run a non-realtime OS.
You can just use a Pi with bare metal code or a real time OS.
any pedal? That seems far-fetched - one trivial counter-example would be a looper as that requires modal input and state. I also wonder if it can handle complex multi-tap delays.
Can anyone give a rough idea of the actual limitations? I would guess that there is a limit to how non-local the effects it can manage are.
If you want to be pedantic it probably can't emulate a guitar pedal that you can tap a representation of a Turing machine into and it will only make a sound if the machine halts.
They claim to be using a LSTM, and I believe that any RNN-like architecture should (in theory) be able to learn a loop pedal.
If you aren't familiar with RNNs, think of it like a NN that, instead of learning an input -> output function, learns an (input, state) -> (output, newState) function.
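Here's a toy PyTorch sketch of that view, processing audio one sample at a time (sizes are made up; this isn't the actual NeuralPi network):

```python
import torch

cell = torch.nn.LSTMCell(input_size=1, hidden_size=20)   # (input, state) -> new state
readout = torch.nn.Linear(20, 1)                          # new state -> output sample

h = torch.zeros(1, 20)   # short-term state
c = torch.zeros(1, 20)   # long-term (cell) state

samples = torch.randn(1000)   # stand-in for incoming guitar samples
out = []
for x in samples:
    h, c = cell(x.view(1, 1), (h, c))   # state is carried from sample to sample
    out.append(readout(h))
out = torch.cat(out).detach()           # shape (1000, 1): one output per input sample
```

Whether the learned state can carry enough information for something like a looper is a separate question.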
Only if the entire loop audio fits into the RNN's hidden state? Or if you use some kind of external memory mechanism that gives the network access to previous audio.
LSTM stands for Long Short Term Memory. It's a recurrent network that learns what and how long things should be kept in its internal state buffer. It doesn't have a fixed state size because it's just learning a nonlinear function that takes an input and a state to an output and a new state. Obviously it can't model all possible, infinite length recurrences, but it can definitely do a pretty good job of approximating long term recurrence relations in complex signals.
I don't think that assessment is quite right. The hidden size is fixed - the second argument to Pytorch's nn.LSTM constructor is "hidden_size – The number of features in the hidden state h".
A call to `y, hidden = layer.forward(x)` (where x has a batch size of 1, and an arbitrary length) produces two hidden states of dimensions `(1, 1, hidden_size)`, where hidden_size is the exact number you passed to the LSTM constructor. Those two states represent the long term and short term memory features.
You would need to have an LSTM with hidden_size large enough to store the samples (or a compressed representation) of your entire loop. Not to mention you'd run into other issues with handling the logic around variable length loops based on a pedal toggle.
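A minimal PyTorch check of those shapes, using hidden_size=20 to match what the linked project appears to use:

```python
import torch

layer = torch.nn.LSTM(input_size=1, hidden_size=20)   # num_layers=1, batch_first=False

x = torch.randn(44100, 1, 1)        # one second of audio: (seq_len, batch, features)
y, (h_n, c_n) = layer(x)

print(y.shape)     # torch.Size([44100, 1, 20])  output at every time step
print(h_n.shape)   # torch.Size([1, 1, 20])      final short-term state
print(c_n.shape)   # torch.Size([1, 1, 20])      final long-term (cell) state
```

Everything the network "remembers" between samples has to fit in those two 20-float state vectors (plus whatever is implicit in the weights).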
The hidden state isn't storing the samples of your loop (or a compressed version of your loop). It's encoding a representation of how the output will change based on what the current state and input are. This might be strongly dependent on what the exact samples in the loop are, but it could also be more general. I think it's missing a bit of the representational power of an LSTM to see the state representation as just a buffer of the current input.
But, yeah, at some point your signal has such a complex behavior on long time scales that there isn't a good way to predict it based on a limited state size (or at least gradient descent can't find a function to predict it for you).
If you can reproduce the original information based only on a state input, you have stored it in the state (in an encoded form or not). If your state is smaller than the original information, you have compressed it. If your reproduction is not faithful to the original, you have created lossy compression.
If the future input samples have a meaningful impact during loop playback, then it hasn't learned the correct behavior of the original loop pedal.
Note that the linked project appears to use a hidden size of 20. Twenty floats. With that much space we're very much back to "sure, you might theoretically be able to loop if the information fits in the hidden size".
Increasing the hidden size beyond 20 still won't solve learning the complex state-machine behavior of an original loop pedal, which can loop variable-length audio. You'd need to provide the pedal state to the network in addition to the audio, and probably need to train it on a bunch of different loop lengths (>thousands?).
This would mostly be an academic pursuit, as it's extremely impractical compared to the other uses of the device.
This is the kind of thing where augmenting the NN with an actual raw audio buffer would allow it to mix in some of the signal from the past quite easily.
You could model amplifiers, distortion effects, phasers, and flangers according to the CoreAudioML project, which makes this possible:
https://github.com/Alec-Wright/CoreAudioML
You can model most useful nonlinear functions with neural networks, this is unsurprising. You can also use Volterra series. You can even estimate/measure the Volterra kernel then train a NN to model it instead of dealing with the computational complexity of generalized convolution for nonlinear dynamic systems.
The hard part is that there are some fundamental limitations to deal with. The biggest is aliasing - distortion effects in particular deal with enormous amounts of distortion (> 100% THD) which creates spectrums far outside the range of hearing. Digital audio systems need to have high orders of oversampling to prevent audible aliasing (8-16x is not unheard of!).
After aliasing is memory. It's too early in the morning for me to do math but I'm almost certain you can't model a looper with a causal NN that has less internal state memory than the length of your loop. Doing so is dumb anyway, since loopers are pretty trivial and their biggest cost is memory. Same goes for digital delay and modulation effects, the algorithms are not expensive.
Now I wonder if a NN would be able to learn a nonlinear effect without aliasing, even if run at the original sample rate. Oversampling and filtering are, after all, things that could become part of the model too. Perhaps it can learn to approximate them with less CPU cost than doing it for real.
Oversampling requires producing more output information than input information. It would be incredible for a NN to realize a system that could do this without requiring more memory and CPU cycles than a good oversampling algorithm, which can be derived analytically with various definitions of "optimal."
A 3rd or 4th order polynomial interpolator is pretty darn good and doesn't need a NN to find the coefficients.
I'm not talking about using a NN for oversampling; I'm talking about whether an NN could learn to reduce aliasing when implementing nonlinear functions without any external oversampling.
Oversampled DSP algorithms work by oversampling, performing the nonlinear processing, then filtering to remove the higher harmonics, and finally downsampling again. We do it this way because it's convenient and easy to understand and based on proven mathematics. But nothing says these steps have to be distinct.
An oversampled DSP algorithm looks like a regular DSP algorithm from the outside, perhaps with some more state and latency required for it to perform the internal oversampling. You can also implement such an oversampled algorithm entirely at the original sample rate clock; it just means the processing needs to internally process several samples per outer-loop sample.
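For reference, the conventional pipeline I'm describing looks roughly like this (a minimal sketch with a made-up hard clipper, using SciPy's polyphase resampler, not anyone's production code):

```python
import numpy as np
from scipy.signal import resample_poly

def distort(x):
    return np.clip(3.0 * x, -1.0, 1.0)       # heavy static nonlinearity: lots of harmonics

def naive(x):
    return distort(x)                         # aliases badly if run straight at 44.1 kHz

def oversampled(x, ratio=8):
    up = resample_poly(x, ratio, 1)           # upsample (polyphase anti-imaging filter included)
    y = distort(up)                           # nonlinearity at the higher rate
    return resample_poly(y, 1, ratio)         # low-pass and decimate back down

fs = 44100
t = np.arange(fs) / fs
tone = 0.9 * np.sin(2 * np.pi * 1000 * t)     # 1 kHz test tone

# Comparing spectra of naive(tone) vs oversampled(tone) shows the aliased
# components that fold back into the audible band in the naive version.
```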
Since neural networks excel at modeling "black boxes" as one amorphous blob that we don't understand, I wonder if a NN could learn to model such an internally oversampled algorithm fairly accurately, and what the computational complexity would be.
Since you can model the oversampling/filtering/etc steps as linear convolutions with wider internal state at the original sample rate, I'm almost certain this will work with the right NN topology. It's obvious an NN can implement oversampling.
And so my question is: could treating the combined oversampled processing as one step, and training a NN on that, potentially result in a more efficient implementation than doing it naively? Especially for heavy distortion that needs high oversampling ratios.
It can’t do time-based effects such as delay, reverb, flange, chorus, etc. The LSTM can model distortion, overdrive, compression to an extent, and amp circuits (including vacuum tubes).
But I will say that there is a project out there for time-varying effects; I haven’t dug into it yet. I believe it’s a different method than the LSTM.
In my attic there is a dusty box with a multi effects unit from the 90's.
The reason I stopped using it is not that the fuzz doesn't react to the guitar's volume pot like a Fuzz Face, or that the Tubescreamer was a TS9 rather than an 808. It was because having a single box with all of your effects in it is a faff to tinker with. I like the dedicated hardware of my pedals, I like having the right number of knobs. I like being able to turn off the fuzz but keep the delay by stepping on the fuzz switch. I like being able to run the TS before or after the fuzz and to see it happen. Mentally, I am much more at home with little boxes for each stage; it is like a real-world flow chart!
You can do all that with modern multi-effects, and even more.
I'm running a hybrid setup, using a Line 6 Helix in front of a Soldano SLO 30 (and on its effects loop as well). Each effect can be adjusted (with all sorts of knobs, more than 12 for some), moved around and triggered individually or together. You can even use the Helix to toggle channels in the amp, manually or as part of presets.
It drastically eased up my workflow to experiment with and mix effects. No more tinkering with cables. No more wondering which pedal is plugged in wrong or doesn't have enough power. No more velcro strips. Plus, selling my hoard of pedals felt nice.
I recently watched a video[1] by Rhett Shull where he compared Neural Capture of a Quad Cortex[2] to real pedals.
I'm not a musician, but even I could easily tell the difference a lot of the time. It didn't sound bad, but there were clearly aspects the Neural Capture failed to, well, capture.
Similar-ish, but clearly very different. I noticed that in that video he was using the WaveNet model, though apparently the stateful LSTM model performs better[1]. Though even for that you can hear[2] a fairly clear difference.
As much as I'm torn with my feelings about nVidia as a company, especially recently with their attempts to artificially limit hardware you own, I must admit Jetson family is incredibly capable and very well executed.
I've been using Jetsons as RasPi replacements wherever I can, they are not only more capable but also much more reliable than RPI4s, in my not so limited experience.
I wouldn't say that. The 2GB dev kit is $50, depending on where you buy it. That's much more bang for the buck, even if you opt out of using the tensor cores they were designed for.
M2 port would have been awesome!
I believe the comment stating Jetsons were expensive was made in comparison with the RPi 4, which also lacks an M.2 slot. I really do find Jetsons much more affordable when performance is taken into account, again, Tensor aside.
They can also boot from USB3 as of recently, which boosts storage access speed tremendously.
I'm guessing a Hi-Z input before this contraption could improve the tone and do the pickups of the guitar or bass justice. A dedicated buffer pedal or just a pedal with buffer bypass perhaps?
Most definitely. A radial reamp DI would improve the tone on this rig noticeably. Even if you didn’t use reamp specific hardware, using 2 direct boxes to match impedance on the input and output of the Pi would help.
The video won’t play for me, but does this allow tweaking of parameters? The problem with most projects like this one is that it replicates the sound of a certain pedal with certain settings. That’s a major problem for most guitarists as playing with the knobs live is a big part of the attraction to pedals.
No, it doesn't. But to be honest, this would be next level. When looking at this, I'm thinking of alternatives and I can only name 2:
1 - Kemper Profiler
2 - NeuralDSP
Both of them are above 1000€/$. We are talking about 10x the price of this thing. Add some Multi-FX Pedal (like Line6 HX-Stomp) where you put this in the FX-Loop and you end up with something equally good for still half the price.
And in general, it's not about what is better, digital or analog. It's about the use case. In the studio, or when noodling around with the knobs while practicing: real amps and pedals. But on stage, you don't play with the settings of your pedals, or you want presets. This is where you go digital. No one will notice the slightly different sound there anyway.
Yeah, true. But you would only change it between songs, I guess, and only to the values you always use for that song, right? This is what I would capture and use as a preset, with the preset change triggered by MIDI from the MultiFX unit.
I know some who change their settings while playing, but I wouldn't consider this standard. Also, you can still add a real pedal to a MultiFX board if you really want to change its settings live.
On the contrary, automatically changing the settings, controlled by your DAW and triggered via MIDI into your MultiFX unit, is also pretty common. You don't have to hassle with your effects at all and can concentrate on playing, which is what you should be doing, as you need to be absolutely on point.
I designed and built a eurorack module around a pi zero with a usb audio input/output, voltage divider, op amp, and some pots connected to adc. Using rtaudio the latency was pretty low and gave me control of the parameters I wanted.
Given that this was the only really active process on the pi, it ended up working really well. I simply converted modules I had written for vcv rack.
I also built one that enabled usb host mode and acted as an audio device that worked with any daw. Ended up being pretty cool for about $20 of parts.
While not as cool as an ml system, given that I was already writing dsp, it ended up being pretty neat.
Honest question: if you're a musician, what is the appeal of digital modeling? Is it purely affordability/accessibility, or are you drawn to it because it would create different sonic possibilities that you couldn't get from the original?
Speaking for myself, the accessibility and convenience is a huge part. Instead of lugging around a 60-pound, fragile, finicky pedalboard, I just throw my Helix in a backpack and am good to go. It has way more pedals in it than I could fit on a real pedalboard, and I can also switch presets with the tap of my foot (including rewiring all the connections between pedals, changing their order, swapping pedals, swapping amps, etc).
And yeah the sonic possibilities are endless. On a physical pedalboard, it's pretty involved to rewire everything to change the routing, or add/remove pedals, etc. With the digital modeling ones, this all becomes trivial and you can try all kinds of different setups much more quickly, save them and go back later, share them with friends, etc.
All that said, it's kind of like asking an acoustic guitarist in the 1950s why they would use an electric guitar. Electric guitars obviously have lots of advantages, but it's not like people stopped playing acoustic guitars. I still think real analog pedals are cool, they're fun to collect, in some cases they sound better, etc. And sometimes you don't need the mega-flexibility -- if you have a 4-pedal setup that does "your tone" maybe that's all you ever need.
Flexibility and affordability. As a guitar player, if I walked into a room and on one side there’s a perfect digital emulation of a Marshall stack, and on the other side there’s an actual Marshall stack, I’d go for the real thing every time. But the reality for most musicians, myself included, is that I wouldn’t be able or willing to pay that much for one. Now if I had a whole library of digital amps/pedals to try, then I could find something I like best and go get the real thing. I don’t see digital as replacing analog, only enhancing it. It’s also nice to be able to plug in headphones, and to play from a laptop or a pedal that takes up much less space.
But as an engineer I just nerd out over all of it. Analog. Digital. If it makes good music it’s all cool to me.
>are you drawn to it because it would create different sonic possibilities that you couldn't get from the original
That's my primary motivation. On my AxeFx, I have signal chains that are impossible with real gear. If I want to tweak that chain, it is a couple clicks of a mouse. I have four expression controllers and 10 foot switches that are tied to different parameters. I can tweak this functionality on the fly. All in a single $2k box.
Above all, tube amps are stupid for bedroom players, as they generally require gig-level volume to get the killer tonez...
I deliberately bought quite a limited but good sounding 4W tube amp for bedroom practice (Vox AC4). Having only a handful of easily understood knobs allow me to focus on my playing.
I have had modeler amps in the past, but every so often I'd just nerd away with the dozens of available amp models and the myriads of settings, and come out with the dissatisfied feeling of having just wasted a lot of time instead of engaging with music.
I find modelers have their place when it comes to replacing a set of analogue pedals, which is the reason I traded my three Boss pedals (compression, reverb, delay) for a Boss GT-1000Core. Overkill for my purpose, as I really never use any amp sims, cab sims, or any of the advanced signal-chain stuff that the device is capable of. I just have one patch with my three pedals for practice, going into the AC4, occasionally turning the same knobs as on the analogue versions, but enjoying that I now have a built-in tuner as well :-)
You mean, besides the appeal, what is the appeal? As an artist and a musician, you can have access to close approximations of sounds that would require a warehouse full of amps and dozens of pedals, costing tens of thousands of dollars - all in a Pi project box. That’s not just a little bit of affordability and convenience.
And I certainly don't mean to sound dismissive about the possibilities that creates! It's a huge factor and I would see it being akin to how the rise of "pro-sumer" home recording equipment played a huge role in underground punk music.
I want to know if anyone ever uses this to create a fleet of classic microphone sounds from a working mic like the SM57. Sennheiser, Neumann, Telefunken, AKG. Hell, you could stack it with preamp filters like Neve. It would bring me great pleasure to see a studio chock full of SM57s masquerading as the best microphones money can buy.
The cool bit about this to me is that $120 in hardware is the one-off prototyping cost. As a product it could be made much cheaper.
Another benefit to a software defined pedal is that it can express sounds that cannot be replicated in analog. Emulation is boring. Train it to do stuff that I can't buy in a pedal!
Line 6 was doing this 10-15 years ago, and it wasn't that cool then. Eventide has some tools for it today too; there's also the Owl, various Teensy projects, and of course Kemper.
I've had a Line 6 Guitar Port for about 20 years. It was $99, and emulates amps, cabinets, pedals, console channels, rack effects, etc., admittedly offloading the work into a computer that (at the time) cost about a grand.
Are they perfect emulations? No. Did anyone notice that in my music? No. Would I take that rig on stage? Also no.
It did help me find sounds I liked though, and over the years I've bought hardware equivalents of some of my favorite emulations, and I've bought hardware that goes beyond anything Line 6 can do.
As to whether Line 6 is cool or not, NIN/Trent Reznor toured with Line 6 rigs to sold out arenas 20 years ago, and today the live rigs are managed by the FOH using software emulations that can be automated or managed from on-stage controllers. Maybe you don't think that's cool, but the important takeaway is that you should use the tool that's right for you.
I think it's good to see people tinkering with new ways to reduce the cost around modulating audio signals in interesting ways.
Yo people!! A friend keeps sending links to guitar pedal discussions to me and I thought I'd chip in to the conversation.
While you can emulate any guitar pedal with AI, and you don't have to pay $120 for it, I'd like to share some understanding.
MUSICIANS love to purchase shit they can show off to others; it's in their ego nature. While $120 for software might compete with a Quad Cortex pedal, you can't show off with $120, but with a pedal you bought for 2 grand you can.
That is why BOSS made a big travel box for physical pedals this year; that's what they released at NAMM 2021.
How do I know that? I make software for them. GarageBand has 1000s of different pedals already, and those are free. Most musicians have Apple or Ableton Live or something similar.
Also, not to discourage anyone, but live music is dying. More and more you see musicians dancing on the stage with their instruments, pretending they are playing, while a pro-edited backing track is going through the speakers.
If you want to make this world a better place, think about how to improve housing/farming/healthcare with technology. Things like a roof above your head and food on the table will always be in demand. There is no money to be made with musicians, because they don't have any. It took me a few years to realize this; you can learn from my mistakes.
If you have some cash and time and are thinking about building some software product: kill that, buy some land, and build a cabin. In the end you'll have another property. You can sell it or rent it, you can leave it to your kids; there is value there. There is no value in your software idea unless people use it like crazy, and they really won't, because they already have 100 pedals and that new pedal bag from BOSS.
Not much need to (or benefit in doing so). If you're somebody like AnalogMan, building pedals with careful selection of NOS transistors wired point-to-point, this is so utterly not a threat to you.
I've seen entirely analog circuits designed and built with the PCB parts and techniques used in these devices, that are miles away from the quality of good pedals, without even being digital or modeling at all. I'm sure the problem circuits measure completely fine, and then you A/B them with a truly great pedal and it's chalk and cheese.
If you try to argue the point, people committed to digital modeling get quite fierce, so tone hounds learn to just roll their eyes and not engage. You can also think of it as learning to rely on secret weapons.
"tone hounds"? There are many hundreds to many thousands of pedals. Everyone has their own favorites. There is no "right" in this world. There's just as much chance that whatever "error" a modelled pedal may embody is preferred by its users as there is that it will be deemed worse.
Andersons has a few YouTube videos where the Kemper profiler fools tone hounds enough that you can infer that it's already good enough. The NeuralDSP hardware is of at least similar quality, maybe better in some instances.
Sure if you run a null test these will fail, but in real life it's really up to how honest the tone hound in question is, and if you can trap them into being honest.
If you're talking product viability, something like the Pepsi challenge, double blind testing, would probably be effective marketing.
I'd put money (a few thousand USD) up against anyone who can fool me that I'm playing on my rig when I'm not. It's a very simple rig. Because that'd be valuable if I ever lost and I'd collect up until I did.
There's no link to the VST. It would be neat if it was something you could just drop into your DAW and try to train on wacky effects chains. The artifacts in its sound might be really neat.
Now that you mentioned samples: I wonder if neural networks will be able to help with polyphonic note detection, so we can trigger MIDI samples using an off-the-shelf guitar or other instruments.
There have been a few recent advancements lately (Boss SY-1), but even the supposedly "ideal" solutions, that require a new polyphonic pickup, are not good at all. I have a Fishman Triple Play and a plugin whose name I forgot, and tracking is frankly terrible.
In which Pat Metheny tries to replace his whole band with guitar-triggered control of real, non-guitar instruments (keyboards, drums, etc) ... "Orchestrion" ... https://youtu.be/KsYEOUKS4Yk
Yeah, it's honestly not that good. I think Metheny uses an Axon. Even with those you need to be very careful with your phrasing, you can't be too fast, you lose a lot of dynamic range and a lot of expressivity, sometimes notes just die, the latency is high...
Here's what Metheny said himself: "But the guitar‑to‑MIDI part has always been a problem. It's a question of physics. On input, I sort of have to rush. But I know how to rush. I play ahead."
It's fun for lots of things, and you can make lots of cool music, but it's still very limited to certain styles, dynamic ranges, phrasing, tempo, speeds...
Polyphonic note detection is largely solved at this point.
But "solved" here means "when not doing the analysis in real time". The realtime solutions are not as good. NN's are not typically great at realtime either, so this may not help very much with this particular goal.
I never claimed or implied anywhere that it does, and I know for a fact TFA doesn't involve note detection.
I'm just disputing the GP's general assertion about NNs that "NN's are not typically great at realtime either", which is quickly disproven by TFA, which uses a NN for realtime audio.
I wonder how it deals with dynamic range. I've always been told that one shouldn't run an electric guitar directly through a stereo because the peaks might fry it. So I would expect that whatever comes out of the RPi doesn't have much dynamic range anyway so computer speakers would be ok, but sound would be kinda meh.
"It’s a well-established fact that a guitarist’s acumen can be accurately gauged by the size of their pedal board- the more stompboxes, the better the player."
As both a software engineer and a guitarist, I'd say the opposite is true. Or at least truer. You can't do math-rock without a lot of pedals but the hard part is to acquire the chops. A lot of pedals, and production effects generally, quickly become cliches and it's like dropping down a musical black hole. Someone like Hendrix could take a new effect (superset of pedals) and make it work musically brilliantly but most pedal users buy the pedal to get someone else's sound.
..but most pedal users buy the pedal to.. show off in front of others who don't have that pedal. Period.
GarageBand has 1000s of pedals; people don't need more pedals, they need to practice more, but they don't understand that.
That's why they keep buying more pedals while bragging about how much they care about a green/clean planet.