Elk OS – Audio Operating System (elk.audio)
322 points by ushakov on May 24, 2021 | 158 comments



It’s a Linux distribution with a real-time kernel.

Here’s a download link if anybody is interested:

https://github.com/elk-audio/elk-pi/releases


When I read things like this, I really wish BeOS, with its already pervasively multithreaded kernel, had won the audio OS wars. _Le sigh._


It's a bit weird but I have such a nostalgic connection to this very specific BeOS ability.

As a teenager I had heard about BeOS and I tried it out. I must have gotten it from a magazine's CD.

And by mistake, clicking around, I opened all of my mp3 collection's files at the same time. Each of the mp3s opened in their own little player window - and they all started playing at the same time, without the UI starting to lag at all.

It gave me an appreciation of what a computer can do


Not only because of that: being fully written in C++ and multimedia oriented, for me it had the feeling of a second coming of the Amiga.

Oh well.


Does Haiku share these qualities?


Yes.


So the real time kernel patches have been mainlined? How comprehensive are the real time features of Linux now?


Almost all of the original PREEMPT_RT patchset is mainlined now, but the features are not (all) enabled by default. Low latency will always present a tradeoff with throughput, and Linux is used for so many different purposes that one is not clearly more important than the other.

For more than 10 years, Linux with the PREEMPT_RT patch has been capable of more-or-less hard real time, with latencies in the tens-of-microseconds range. That can be thwarted, however, by a variety of hardware features (SMI being the worst offender, plus devices that hog the PCI bus for too long), so you are not guaranteed this performance just by running that kernel.
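
For the curious, latencies like that are usually measured with a cyclictest-style loop: sleep until an absolute deadline under SCHED_FIFO and record how late the wakeup actually was. Below is a minimal sketch of that idea, not anything Elk-specific; the priority, cycle period, and iteration count are arbitrary examples, and it needs the usual real-time privileges.

    /* rough sketch of a cyclictest-style wakeup-latency measurement */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <time.h>

    #define NSEC_PER_SEC 1000000000L
    #define PERIOD_NS    1000000L            /* 1 ms cycle */

    static long ts_diff_ns(struct timespec a, struct timespec b)
    {
        return (a.tv_sec - b.tv_sec) * NSEC_PER_SEC + (a.tv_nsec - b.tv_nsec);
    }

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 80 };
        struct timespec next, now;
        long worst = 0;

        /* fixed real-time priority and locked memory, so neither the
         * scheduler nor paging adds surprise latency */
        sched_setscheduler(0, SCHED_FIFO, &sp);
        mlockall(MCL_CURRENT | MCL_FUTURE);

        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int i = 0; i < 10000; i++) {
            next.tv_nsec += PERIOD_NS;
            if (next.tv_nsec >= NSEC_PER_SEC) {
                next.tv_nsec -= NSEC_PER_SEC;
                next.tv_sec++;
            }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            clock_gettime(CLOCK_MONOTONIC, &now);
            long late = ts_diff_ns(now, next);   /* overshoot past the deadline */
            if (late > worst)
                worst = late;
        }
        printf("worst-case wakeup latency: %ld us\n", worst / 1000);
        return 0;
    }

On a stock desktop kernel the worst case from a loop like this can spike into the milliseconds under load; on a PREEMPT_RT kernel with well-behaved hardware it typically stays in the tens of microseconds, which is the figure quoted above.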


>more-or-less hard real time

Also known as soft realtime.

For hard realtime, you need guarantees. Linux is too complex and cannot possibly offer them; this is the territory of formal proof, with complexity growing exponentially with code size.

Look at seL4 for hard realtime.


Even with the "guarantees" offered by e.g. seL4, you still need cooperating hardware. SMIs on most mobos cannot be masked, and can take an eternity to be handled by the BIOS.

Fortunately, for realtime audio, you don't need hard realtime unless you're very close to overloading the CPUs. Soft realtime (i.e. 99% of all interrupts handled within N usecs, 100% handled within N*2) is good enough.


Right. Also, for hard realtime with 100% deadline guarantees (assuming the absence of soft errors and programming errors for a second) you would choose an appropriate CPU like the ARM Cortex-R series. You'd also take care of the peripherals, i.e. choose 802.1AS or even FlexRay.


100% handled within N*2 would be hard realtime.


No. Soft real time can miss some real time deadline.

AFAIK Linux RT kernel is designed to hit every deadline, but they just cannot offer a formal proof.

So "more-or-less" hard real time.


One of the problems is that the Linux RT kernel doesn’t stop you from using/writing devices/drivers that break those guarantees systemwide, as I understand it.


What kernel can give such a guarantee?


A kernel alone on a random hardware platform can't, but a kernel and a hardware core that watchdogs and flips the bird at a rt-violating driver+device, sure (I've toyed with VxWorks on devices that work this way).

Usually when you're entering that realm of such requirements, you're rarely talking about off-the-shelf devices.


One that has no such drivers? Which would then imply limited hardware support as the price for the guarantee, which is a trade-off a general purpose kernel like Linux isn't going to make.


seL4[0]. I'm not aware of any other.

[0]: https://sel4.systems/


A microkernel one, potentially at least.


>AFAIK Linux RT kernel is designed to hit every deadline, but they just cannot offer a formal proof.

And that's why it is soft realtime. No guarantee, no hard realtime. It really is that simple.


Linux has the ability of entirely freeing a few CPU cores with isolcpus though, so that you can implement your own real-time loop on it
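
As an illustrative sketch of that approach: boot with something like isolcpus=3 on the kernel command line, then pin your real-time thread onto the reserved core. The core number, priority, and function names below are just examples, not anything Elk-specific (compile with -lpthread; SCHED_FIFO needs the usual privileges).

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *audio_loop(void *arg)
    {
        (void)arg;
        /* ... your own real-time processing loop would run here ... */
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        pthread_attr_t attr;
        cpu_set_t cpus;
        struct sched_param sp = { .sched_priority = 80 };

        pthread_attr_init(&attr);

        /* restrict the thread to CPU 3, which isolcpus keeps free of
         * ordinary scheduler load */
        CPU_ZERO(&cpus);
        CPU_SET(3, &cpus);
        pthread_attr_setaffinity_np(&attr, sizeof(cpus), &cpus);

        /* ask for a fixed real-time priority instead of the default policy */
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        pthread_attr_setschedparam(&attr, &sp);

        int rc = pthread_create(&tid, &attr, audio_loop, NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
            return 1;
        }
        pthread_join(tid, NULL);
        return 0;
    }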


Sure. And you can also run an RTOS on a separate chip, elsewhere on the same PCB.

Linux is still Linux, and it is still fundamentally not capable of hard realtime.


If anyone wants to try the as-much-rt-as-possible build in a nice ready package, there's the Xanmod project https://xanmod.org/ which offers it precompiled with a few other tweaks. It's aimed at the gamer crowd, but I had a really good experience using it with music processing.


The default xanmod kernel is still not using the PREEMPT_RT features. They do note that "Real-time Linux kernel (PREEMPT_RT) build available [5.10-rt]." and that's how you get "as much RT as possible". The features in the normal version of their kernel will improve the user experience of latency, but will not address "realtime" in the way that PREEMPT_RT does.


Correct, the 5.12 versions don't have the PREEMPT patches, you'll need the 5.10-rt version.


Dumb question inbound: are those features boot/runtime-togglable or compile-time options?


Compile time.
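
For reference, with the RT patch applied it is selected as the preemption model in the kernel build configuration, roughly like this (option names have shifted a bit between kernel versions):

    # kernel .config fragment
    CONFIG_PREEMPT_RT=y
    # older patched trees called the same thing:
    # CONFIG_PREEMPT_RT_FULL=y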


Elk is using Xenomai, it's not just PREEMPT_RT.

https://elk.audio/audio-latency-demystified-part-ii/


Is it possible to make a BSD kernel real-time? (No, IIRC.) I notice Elk uses Xenomai. NetBSD has a utility called schedctl that purports to control the scheduling of threads and processes. An example in the manpage describes running top(1) with "real-time priority". https://man.netbsd.org/schedctl.8

The ability to compile a (close to) real-time kernel seems to be one difference between Linux and BSD not often discussed.


The approach of making general-purpose OS kernels realtime seems like hard work to me. Surely what we want is a realtime microkernel, on which the conventional operating system can run as a (large!) task. As long as there is a way for processes running under the kernel to start realtime tasks directly on the microkernel, you can support all your realtime use cases, without having to make the whole operating system realtime.


FreeBSD includes a realtime scheduler priority class (round robin per priority level).


what is a real time kernel vs a normal kernel? just "faster"?


'real time' generally means latency guarantees and trading some throughput for consistent low latency for interrupts and scheduling. So faster in some dimensions and slower in others.

The Arch Linux wiki has some description of their real time patches:

https://wiki.archlinux.org/title/Realtime_kernel_patchset


Arch wiki is an amazing source of practical information. It's like graybeard forums but distilled into readable text. My last laptop ran the Mint distro; whenever I searched around for HW issues it was always their wiki, fully stocked with useful info and steps on how to fix stuff.


Yeah, I'm always amazed when I end up looking at the Arch wiki for a Debian issue, simply because the Arch wiki is more relevant and up to date.


wow thanks for calling that out. Just spent an hour browsing through it and it was basically an Operating Systems refresher course. Great resource!


"graybeard forums" = ?


I'm referring to the usenet, mailing lists, various issue trackers and distro-specific forums where knowledgeable people diagnose and solve problems for others. The Arch wiki has some of the same diagnostics/troubleshooting content in a more approachable format, kept up to date. Quite a few times it pointed me in the right direction even though I don't use Arch.


ah so grey-haired as in experienced people, think Jon "maddog" Hall. got it. thanks.


Often it's explicitly not faster— you intentionally do work in a more interruptible/resumable/chunked way even if it's less efficient to do it that way than to just block the system and do it all in one shot.

These kinds of tradeoffs don't make sense for a lot of common Linux workloads, which is why it isn't the default.


Yes, and: in return you get documented guarantees about how much time certain operations are allowed to take. As opposed to “We work hard to make it fast most of the time. But, no promises because sometimes stuff happens.”


Real time offers predictability. In some contexts being certain of a 3ms latency is better than going randomly from 1 to 4ms which could make the code faster on the average but with less predictable timings.


"More responsive," is the typical simplified description.


> 1ms round-trip

Ooh, a rare latency claim that actually specifies which type of latency they mean! One millisecond round-trip latency, if true in practice, is quite impressive.

I'm going to continue reading now... :)


I've played with Bela [1] which uses the same real time kernel IIUC. It had super stable audio performance down to a 2 sample buffer, which felt like programming hardware. I'm very happy to see more work in this field.

1: https://bela.io/


Bela is fantastic because it pairs good audio hardware with low latency software. Elk looks good, but I'd need something better than a Raspberry Pi in terms of audio IOs to use that full potential.


There's a decent-looking hires 6in/6out hat on the site store, albeit "out of stock".

Bela is indeed fantastic, but the CPU is a bit weak.


I’ve never understood the latency requirement for professional gear; are you able to articulate why there's a need for lower latency than average consumer gear?


Latency means you can't hear what you're doing. This is especially important for drums and transient sounds because playing with feel will move the beat around by a few ms. Adding a fixed delay to that makes it more difficult and adds cognitive load. Adding a small variable delay makes it impossible.

No one cares if your pad sounds are a little late, but variable latency absolutely destroys drum grooves, sequenced electronica, and the kind of super-precise playing needed for good classical piano.

You might think most people can't hear the difference. And they can't. But - as with other audio production values - they can feel the difference. Tiny variations in timing are the bedrock of many kinds of musical expression. If they're not there, people notice.


Three things come to mind--

1. If you strum the strings of a guitar and want your laptop to make you sound like Jimi Hendrix, it had better output sound quickly enough that you perceive it happening "immediately." (I.e., it should sound and feel just like an electric guitar that's hooked up to an amp.)

2. A laptop like the one above had better make you sound like Jimi Hendrix for every moment you are playing the guitar. The website doesn't state it, but they are making a claim to 1ms round-trip latency in the worst case. If your first strum of the guitar convinces you that the resulting sound is happening immediately, then every subsequent strum must sound like it's happening immediately, too. Users don't merely rant about a sporadically janky digital instrument as if it were your run-of-the-mill Windows 10 UI. They throw it out.

3. There's something I call El Dorado latency, which is always half the latency being claimed and is eternally sought after by users. (Works for system latency, round-trip latency, etc.) Since the unit for measuring audio latency is milliseconds, and since "1" is the smallest whole number greater than "0", a claim of 1 ms essentially short-circuits the search. E.g., if they had claimed "2" then El Dorado latency requires the user to imagine how much better the system would sound at "1". :)

4. Okay, I'm being a butt in #3 above. Round-trip latency in audio seeks to go down to "ultra-low" latency measurements for the same reason gamers want frame-rates to go up to gazillions. In each case it means developers can choose one of the following for their end-users: a) add more hungry, hungry hippos per unit of time, or b) make the same number of hippos hungrier.

Edit: clarifications


??

Half of 1 millisecond is 0.5 milliseconds.

Just because their published latency is the “smallest whole number” as you seem to be pointing to - doesn’t mean that it can’t be halved. What a strange thing to say.

Ableton measures latency down to the tenth of a millisecond. It’s really not a big deal.


And latencies add up when chaining hardware: 1ms for the effect pedal, 1ms for the amp, 1ms for the audio interface, etc..


what does the "hungry hippos" metaphor/idiom stand for? https://en.wikipedia.org/wiki/Hungry_Hungry_Hippos didn't help


I think it just meant more "stuff"


I thought they were referring to latency vs bandwidth


Different use case - consumer gear is generally only used for listening to recordings, whereas professional gear is often used for real time recording of someone who's playing an instrument while listening to the output.


It only takes a small amount of latency (5 ms or so) to noticeably impact performances by musicians. Latency is also an issue when you want to monitor a live signal along with other recorded audio, like in a DAW. It is not as big a deal with a playback only device.


rule of thumb is that sound travels roughly 1ft per ms, so 5ms is the same as standing 5 extra feet from the speaker. Most musicians don't have much trouble up to around 10-20ms.

Jitter is killer though - if the latency is randomly changing it's really annoying.
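
Spelling out the arithmetic behind that rule of thumb (taking the speed of sound in room-temperature air as roughly 343 m/s):

    343 m/s ≈ 1125 ft/s ≈ 1.1 ft per ms
     5 ms  ≈ 5-6 ft of extra distance from the speaker
    10 ms  ≈ 11 ft
    20 ms  ≈ 22 ft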


Is the speed of sound relevant in most recording scenarios? Assuming headphones and instrument/vocal mics that are within a few inches, then everything is traveling as electrical signals at the speed of light.

Or is this just meant as a visualization of latency in these scenarios versus what it would be like with musicians playing x ft apart from each other in a room?


I recently played on an electronic drum set and it was absolutely fine with headphones. But when we hooked it up to the speakers across the room there was a very noticeable delay (~5ms as we calculated) when hitting the pads with the stick.

Since I only used acoustic drums until then I never gave much thought to the idea of having your drums sound delayed. It was quite eye opening.


It absolutely is. Many times when rigging halls for recording live events I've had to insert delay into sub mixes to match latency to the back of the room. It's standard fare for a live audio tech (a thing I do on the side occasionally).


> Assuming headphones and instrument/vocal mics that are within a few inches

It is relevant, and monitoring via headphones or a floor speaker is a mitigation.

It's also a problem for large orchestras. If everyone else just plays along with what they hear from the people near them, the timing is going to be inconsistent across the orchestra pit even if you have some obvious audible reference like loud percussion. The mitigation is to orient yourself as much as possible around the visual cues from a conductor.


That's a good point, and I'd imagine instruments like a church organ would suffer from this too, though I know in that case there are a bunch of other sources of delay, like the time it takes for the pipes to fill with air.


The latter. The point being that they are trained, comfortable, and proficient in those scenarios and the same translates to a (consistent!) software delay of up to 10ms.


Yes it is true that you can deal with that amount of latency, but 10-20ms is way up in the 'very sloppy' range especially when it comes to recording.


Most musicians use headphones / earplugs in professional settings


which is interesting but not the point?


But it's in part because of the latency due to the distance to loudspeakers, which isn't an issue with headphones


I think the loudspeaker part was a comparison to understand it a little bit better.

"If the system processing your signal introduces a 20ms delay, it would be the same as standing 20feet away the loudspeaker you're using as reference."


Interesting, I wouldn't have expected it to be that low, though in the aircraft space, control harmony can be disrupted with as little as 20ms of delay between control input and control surface movement[0].

Do you have any research you can suggest for that 5ms number?

[0] http://www.klabs.org/history/history_docs/reports/dfbw_tomay... (Keeping the delay at or under 10ms was the recommendation I got from the study P.I.)


Here's one :

https://www.jstor.org/stable/10.1525/mp.2006.24.1.49?seq=1

And another (follow the links to the article, but I thought that particular graph was most informative) :

https://www.researchgate.net/figure/Audio-latency-tolerance-...


It's not that low. Up to 40ms (roundtrip) isn't a problem for solo playing, in my experience. And it's also a matter of training. Church organists learn to play with much larger delays.


I can deal with 10ms roundtrip, but not much more.

One issue is when you get some acoustic sound too (open headphones, bone conductance, singing) and pretty much any delay in the headphone signal causes phase interference.


Yeah, I have to add that for me 40ms is ok when playing a virtual instrument.

Did you ever see people who try to sing with headphones with a delay of around 200ms? That really messes up your performance.


With a virtual instrument you would usually not face the full round trip latency, but rather control latency (MIDI?) and audio output latency (usually 50% of roundtrip). So in the VSTi case that might be closer to 25ms than 40ms in practice. Often the latency jitter is quite dreadful in this case as the control signal is not synchronous to audio.


Press a key on a real piano - instant sound.

At 10ms (and possibly less), you will start to notice the lag and it will affect your performance.

https://www.soundonsound.com/sound-advice/q-how-much-latency...


Nope, you sit 3 or 4 feet from the soundboard in a piano, so the latency is around 3 or 4ms.


The speed of sound in air is ~330m/s so sitting about a metre away will produce ~3ms delay. However the speed of the vibration through human flesh (far denser than air) is about 4 to 5 times faster than the speed in air. You will "hear/feel" the sound transmitted through your body far faster than the sound transmitted through air. This may be why I'm always seeing this magical "1ms" value.


The 10ms (or 5ms, depending on the source) is a threshold - under which a human perceives the sound as "instant", even though - yes, physically - it is not actually instant.


When playing a software instrument via an external hardware controller (typically via MIDI over USB), even a tiny delay from the musician striking a key or a drum pad to the resulting audio out of the speaker can become very problematic for instruments with a very fast attack, where exact timing matters a lot to the performance. Examples include drums and percussion, pianos, many plucked type of sounds. With slow attack sounds like legato strings, extra latency is generally much less of a problem.


If you're trying to play along with something, synchronizing with the beat, the latency will completely destroy your ability to get into the groove if what you're playing is heard with a delay.


In consumer audio, you produce the sound and you're done. In devices used for musical performance, you produce audio, which then feeds into the brain of a musician, who then makes a decision to change their behavior in response to what they're hearing, which then affects the audio. Those constant round-trips through the brain make it very easy to notice latencies. (They pull you out of "the zone".)


Not sure what you mean by average consumer gear, but when I'm performing, anything higher than a 15ms round-trip latency is starting to have a negative effect on my ability to play tightly. I can do with more latency, but that's the point where it starts being unsettling or more tiresome.


That's necessarily going to be aspirational in at least some respects, though—it's limited by the hardware.


I wonder if there's a way for computers to self-measure round trip. Even if you have to plug an aux cord between the speaker + mic jacks.
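
A minimal sketch of the aux-cord idea, assuming you already have the played and recorded sample buffers (the actual playback/capture via ALSA, JACK, etc. is omitted, and all names and sizes here are illustrative): find the lag at which the recording best lines up with what was played, then convert samples to milliseconds.

    #include <stdio.h>

    /* return the lag (in samples) that maximizes the cross-correlation
     * of `recorded` against `played` */
    static int best_lag(const float *played, int np,
                        const float *recorded, int nr)
    {
        int best = 0;
        double best_score = 0.0;

        for (int lag = 0; lag + np <= nr; lag++) {
            double score = 0.0;
            for (int i = 0; i < np; i++)
                score += (double)played[i] * recorded[lag + i];
            if (score > best_score) {
                best_score = score;
                best = lag;
            }
        }
        return best;
    }

    int main(void)
    {
        const int rate = 48000;

        /* toy data: a one-sample click, and a "recording" in which that
         * click shows up 96 samples (2 ms) later */
        float played[256] = {0}, recorded[1024] = {0};
        played[0] = 1.0f;
        recorded[96] = 1.0f;

        int lag = best_lag(played, 256, recorded, 1024);
        printf("round-trip latency: %d samples = %.2f ms\n",
               lag, 1000.0 * lag / rate);
        return 0;
    }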


My first paper as an undergrad measured MIDI round-trip by tapping the input and output MIDI wires and then looking at how the signals were offset: https://www.kmjn.org/publications/MIDI_ICMC04-abstract.html

For keyboard input, one paper (whose citation I can't seem to find) used a clicky mechanical keyboard, pointed a mic at the computer, and measured the offset between each corresponding pair of keyboard click and sound output.


That is the way to do it. In Ardour it is part of the audio setup.


People have done experiments using high-speed cameras and such.


When I saw the headline, I wondered/hoped if it was an OS with an audio-only UI. I'm sure this is cool, but a fully thought-out audio-only UI would be fascinating.


As you might have guessed, many audio-only UIs have been developed for blind people over the decades. One example of an audio-only PDA (from the time when the term "PDA" was still widely used) was the LevelStar Icon. (Disclosure: I did some contract work for LevelStar in 2005-2006 and was friends with the lead developer.) After some digging I found an old audio presentation about the LevelStar Icon here:

http://www.vibug.org/audio/vbg0407.mp3


I started developing a blind computing system in the late 80s based on a BBC Micro and a very early speech synthesizer.

It never went anywhere, and I'm no longer in contact with the blind friend who was helping me, but I often wonder if today's solutions, such as Apple's VoiceOver, aren't the ideal approach.

They seem to be trying to adapt the GUI for the blind, rather than trying to build a text/voice based system from scratch, like I was doing.

Siri / HomePods and similar appliances are getting there, I suppose, but they still feel like computing accessories, rather than fully fledged computers themselves.


As a legally blind developer who has worked on accessibility for a while (see my HN profile for more), I'm inclined to agree with you. We use screen readers like VoiceOver because, practically, we often need to access mainstream applications. But I think we tend to get too dogmatic about going mainstream, to the point that some of us are disdainful toward systems designed specifically for blind people, particularly if they involve custom hardware. I've thought about developing a Linux desktop environment with a shell that's designed specifically for TTS, but that also includes a more conventional screen reader for running GUI applications.

Do you remember which speech synthesizer you were working with in the 80s?


Yes it was a software synthesizer called "Speech!" by Superior Software, blown into a ROM.


Ah, must have been a BBC Micro-only thing, because I never heard about it.


It certainly was, yes. The BBC Micro had an optional hardware speech synth you could add, but it was very inflexible, limited to a fairly small set of pre-defined words. "Speech!" could render arbitrary phonemes, which made it a lot more flexible.


Flexible enough to say the word "Citadel!" for example. :)

</nostalgia>


You can try it in your browser; Go to https://bbc.godbolt.org/?&disc1=sth%3ASuperior%2FSpeech.zip, then copy and paste this to the paste thing at the top right:

SPEECH SAY Hello *SAY How are you?


Thanks for the link. :)

My comment was intended as an incredibly subtle reference to the trivia that the BBC Micro game "Citadel" 'famously' had speech generated by "Speech!" in its intro screen (but appreciate you taking the time to share the link :) ) as shown here : https://youtu.be/Hu5vu9SgGZI?t=44

For anyone else wanting to try out the emulator link in the comment above, HN messed up the formatting, so the asterisk prefixes were missed on the first two commands:

    *SPEECH

    *SAY Hello

    *SAY How are you?


I'm very curious about your thoughts. How would you make a shell designed for TTS?


There's emacspeak.


I remember ADRIANE being included in the Knoppix ISOs back in the day. You might find it interesting: http://www.knopper.net/knoppix-adriane/index-en.html


It's less comprehensive, but depending on your definition, devices like Amazon's Echo or Google's Home could count as audio-only OSes.


Although things may change a bit with the arrival of Apple M1 systems, the marketing mentions of VST in this context are fairly weasely.

For users, good luck finding more than a couple of plugin developers that make their plugins available for (a) a Linux based system AND (b) the ARM architecture.

For developers, VST is just an SDK, and doesn't require OS support in any meaningful way. So sure, if you want to create VST(3) plugins for a Linux+ARM platform, you can do that, with or without Elk.


It's a great project. It's intended to help hardware makers bring pro audio gear to market with less cost and less risk. Not as a retail VST host.


Yeah, it is a great project (though it's not a new idea in any sense). But the positioning of being "endorsed" by Steinberg bothers me, because it means nothing.


Why does it bother you? Why not focus on the technical substance? It's not a new idea, but if you explore the project you will see many excellent features. The fact that there is a Raspberry Pi based dev board is simply to speed adoption and provide a reference implementation. It's an excellent project.


One such project is REAPER. It includes a lot of useful VST plugins. https://www.reaper.fm/download.php


Looks cool! …but what is it? Software, hardware or both? (And is the software part a Linux distro with a tweaked kernel?)


It's a pi header https://elk.audio/elk-pi-basic-dev-kit/

and an operating system that runs on pi


It runs on many arm flavors not just pi


"With Elk hardware companies can move away from dedicated chips and use general purpose ARM and x86 CPUs without any compromise in terms of latency, performance and scalability."

Less diversity decreases competition and overall technological development. In the 1980s, some OS developers actually blamed Unix for stalling the progress of OS research.

Wouldn't a world of ARM-based synthesizers be kind of stagnant? Sure, you can dream that compatibility allows more open platforms, but we all know hardware companies always end up making the end product proprietary anyway.

To be fair, I don't know a lot about hardware. I'd also guess that the world of music electronics has been longing for some sort of hardware medium to sort out the mess of obscure ISAs and poor documentation that hardware engineers have to put up with.


I work in that industry; at this point most of the big players are on either ARM SOCs or SHARC, with a few plucky folks basically making x86_64 servers.

Once performance is comparable for the workload, ARM-based synthesizers aren't inherently any more stagnant than DSP-based synths, in the same way that ARM-based tablet apps aren't necessarily stagnant; if anything, the freedom from crappy compiler lock-in, incomprehensible instruction pipelines, assembly code, and (as you say) poor documentation is a plus. Moving to ARM, GCC/LLVM support, and (eventually) Linux makes all the things other than making neat sounds (UI, interconnect, networking, power management, etc.) easier, so that developers can spend more time on the creative bits.


Looking at other FOSS OS efforts competing against Linux in the hardware space, I imagine that long term it might not be Linux per se, rather POSIX that will win out, with Linux being one among many.

I mean stuff like Zephyr, NuttX, mbed, FreeRTOS, Azure RTOS.


There's a range of capabilities for sure, depending on how close you want your device to get to "This is a PC". Mbed, FreeRTOS, and the other small RTOSes are nice for what they are, but their support for more advanced things like multi-core operations, filesystems, multi-tasking and multi-user functions is still limited (and, IMO, should remain so as they expand their support for lower-performance devices).


> Wouldn't a world of Arm based synthesizer be kind of stagnant?

The computer in a digital synthesizer is an implementation detail to the consumer. What they're buying is the synthesis engine, the way it sounds, performs and interoperates, the case, the user interface etc. In that sense the CPU used seems about as relevant to stagnation as the screw heads used for the case.


I don't work in the industry this OS is marketed towards. I do however have some experience in embedded development and synthesiser programming. Can someone more familiar with the field answer a few questions for me?

- The website says: "With Elk hardware companies can move away from dedicated chips and use general purpose ARM and x86 CPUs without any compromise in terms of latency, performance and scalability". Is this intended as a ready-made OS for hardware synthesisers, or as a way to host VSTs on consumer SoCs?

- Assuming you're not targeting an obscure platform or dealing with a cumbersome toolchain, is writing a bare-metal executive the most difficult or expensive aspect of engineering a synthesiser? I've done some experiments breadboarding simple synthesisers with Arduino and STM development boards plus a 16-bit DAC. In my case the most challenging aspects weren't getting a real-time executive loop or handling I/O. Mind you, I didn't go as far as implementing a GUI.

- Is real-time Linux not overkill for most synthesisers? From the outside looking in, it looks like it would add more overhead and complexity than solve problems. If your objective is just to host a VST, this might be a decent trade-off. Or maybe if you intend to use high-level languages for the GUI.


> real-time executive loop or handling I/O

I did some audio implementations on MCUs of all kinds, and while I managed to pull off multitracked, real-time mixed audio players under 4ms latency, I learned that the real problem here is RAM (other than computing power).

Many audio effects require a lot of RAM. Think of how you would do a 5-second delay effect, or good-quality pitch shifting (FT and inverse FT), reverb, echo, etc. Before you know it, 128KB of RAM is not enough any more. So now you are dreaming about 128MB of RAM (as a minimum).

If you want to process or output audio from/to network, USB or save the output into a file then you have to add that overhead to the equation.

Before you know it, you'll start looking at small processors like the i.MX28 or the like. Bare-metal BSPs for those processors are quite expensive in terms of code/time. You get what the manufacturer gives you, and that is a BSP for Linux.
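
To make the RAM arithmetic from the 5-second delay example concrete: 5 s × 48000 frames/s × 2 channels × 4 bytes per float sample is roughly 1.9 MB of state for one effect, before anything else in the chain. A minimal circular-buffer sketch (sizes and names are illustrative only, not taken from Elk or any particular product):

    #include <stdlib.h>
    #include <string.h>

    #define SAMPLE_RATE  48000
    #define CHANNELS     2
    #define DELAY_SECS   5
    #define DELAY_FRAMES (SAMPLE_RATE * DELAY_SECS)

    typedef struct {
        float buf[DELAY_FRAMES][CHANNELS];  /* ~1.9 MB circular buffer */
        size_t write_pos;
    } delay_line;

    /* write one input frame, read back the frame from DELAY_SECS ago */
    static void delay_process(delay_line *d, const float in[CHANNELS],
                              float out[CHANNELS])
    {
        /* the slot about to be overwritten holds the frame written
         * DELAY_FRAMES ago, i.e. exactly 5 seconds of delay */
        memcpy(out, d->buf[d->write_pos], sizeof(float) * CHANNELS);
        memcpy(d->buf[d->write_pos], in, sizeof(float) * CHANNELS);
        d->write_pos = (d->write_pos + 1) % DELAY_FRAMES;
    }

    int main(void)
    {
        /* far too big for a 128KB MCU, trivial for a Linux-class SoC */
        delay_line *d = calloc(1, sizeof(*d));
        if (!d)
            return 1;

        float in[CHANNELS] = {0.5f, 0.5f}, out[CHANNELS];
        delay_process(d, in, out);

        free(d);
        return 0;
    }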


> - The website says: "With Elk hardware companies can move away from dedicated chips and use general purpose ARM and x86 CPUs without any compromise in terms of latency, performance and scalability". Is this intended as a ready-made OS for hardware synthesisers, or as a way to host VSTs on consumer SoCs?

Seems like both and more.

> is writing a bare-metal executive the most difficult or expensive aspect of engineering a synthesiser?

Likely a big part of the expense, yes. The hardware is obviously a big chunk of it too, but programming the software side of things is also expensive, and sometimes means the product receives fewer updates in the future because the programmers who understood the platform left the company. See the Octatrack as an example, where the code is so complicated it's unlikely to see any bigger features released now that the original programmer left the company.

Also, consider that many people have experience with general computing platforms while not having special expertise about particular chips; a solution on a general platform can help more people get started, learn more and maintain their own setups with skills that easily translate as well.

> Is real-time Linux not overkill for most synthesisers? From the outside looking in, it looks like it would add more overhead and complexity than solve problems

Probably, but probably not. Music making is more about having options available and use those in creative ways. Being able to use a general computing platform opens up more options than specially programmed OSes for particular instruments. Being able to revive old synths with new software would be wonderful.


Interesting, I was actually looking into distros set up for audio recently because I was hoping to build a little jukebox r-pi or similar but noticed you'd have to recompile the kernel on the official pi distro to get full support for high-resolution audio


"high resolution audio" is a bit of a weasel phrase.

My RPi4 is quite happy doing what any audio engineer would consider "high resolution audio", no kernel recompile required.


192 kHz / 24-bit


Just put a hifiberry card in there. Done.


I already have a DAC that can do it, but I was hesitant about trying to use the pi after seeing this forum thread https://www.raspberrypi.org/forums/viewtopic.php?t=25810


That thread is rather old, and seems to be concerned with using HDMI, which I don't think anyone concerned about high resolution audio should be doing. The last comment confirms "working fine" back in 2013, more than 7 years ago.


This is awesome, this has been one of my bucket list projects. Audio on most devices sucks.


It would be awesome if they could integrate with an existing AVB network, i.e. just use one network port of the device as an AVB port.


I haven't looked around very much, but getting VSTs on something other than a desktop DAW has been something I've wanted for a while, to enable audio from a dumb keyboard, without a dedicated (limited) synth or a full desktop/laptop. This looks like it supports it on an RPi, which is great! I'll have to look into this.


Reminds me a lot of the Bela project [1]. The idea is that they expose the raw audio device interrupt handler so applications can do DSP processing without the kernel being involved at all. Wonder if this is doing something similar?

[1] https://bela.io/


Bela does not expose the audio device interrupt to applications. Your code runs in the Xenomai RT kernel, which functions somewhat like a hypervisor. That means that your RT code doesn't run as part of Linux at all. It also means that you need to use Xenomai-specific techniques if you need to communicate with regular Linux application code.

You'd get more or less the same thing by writing a kernel module for Linux, which would also have to use somewhat specialized techniques for interacting with user space application code.


I stand corrected - thanks for the clarification.


> Working with our 5G partner Ericsson, Elk will open up completely new products and network services to connect musicians and equipment remotely over the internet in real time.

This seems difficult. What is the 5G latency chain like in practice?


My guess is that they don’t mean audio.


I think they do - 5G seems to have the prospect of some realtime low latency guarantee features (though I don’t think it’s rolled out yet). With this in place and an RT audio device you could build an all in one remote jamming box without having to insist on wired internet or deal with the wifi jitter. Currently the user’s network is often the bane of things like Jamulus that try to do this in software + your internet connection


Could I use this with PTP clock synchronization to create my own wireless speakers?


Elk seems to be pushing the RPi as the h/w platform. You will likely have a difficult time finding an RPi with an ethernet chipset capable of correctly doing PTP.


I don't think you need the sub-microsecond resolution PTP compatible PHYs offer. I was hoping to get away with software timestamping which I believe should still give you sub ms synchronization.

My real question, I guess, is how in sync do the speakers need to be? Just on the order of human perception, or do we need them to be in phase?


I have some aging Slim Devices units in my house system that I've been contemplating replacing with a custom system. However, I would just use wifi to an RPi with a suitable DAC installed. There'd be no per-speaker sync, since the DAC->speaker connection is analog, and before the DAC there'd be essentially streaming code that would always be handling both channels.


If I get it right this is digital effects controlled by analog controls.

Whereas the holy grail of the audio world would be analog effects controlled digitally :)


> If I get it right this is digital effects controlled by analog controls.

Yes, but also more. For me, the greatest feature is the ability to run VST plugins without having to use a full computer/laptop, since my setup doesn't include any traditional computers.

> Whereas the holy grail of the audio world would be analog effects controlled digitally :)

That's been done for a long time already; a bunch of Elektron Analog machines do this, to give just one example.


Yes, I also know about Chase Bliss pedals, I think they use midi.

It's been done but it's not really "mainstream" and standardized.

Imagine a standard where all your pedals etc can be recalled / automated from another (central) unit, with a protocol that the hardware can adapt to, motorized faders, rotary faders with leds, or basic components etc.

Maybe that's what Wes Audio is doing with the gcon protocol that has been linked below.


Motorized faders and rotary encoders would be nice, but also overkill. Most of my gear, even analog like Moog Sirin supports MIDI Program Change messages, so this already works today.


The holy grail would be analog effects controlled digitally with a direct analog control layer. WesAudio would be an example of this [1]

[1]: <http://wesaudio.com/_prometheus/>


That's great! I wonder why it's not done more, why there is no kind of standard for this? I looked into it and it's complicated on the electrical side etc., and I guess there are also costs.

Recall time is what forced a lot of professionals to totally move in the box.


mix:analog do that: https://mixanalog.com


Can I use this with commodity audio interfaces? The website does not state that, as far as I could find.


I remember Ubuntu Studio being geared towards musicians. Does Elk have its own proprietary components? I noticed that the Raspberry Pi image is noted as open source but not the OS in general. https://ubuntustudio.org/tour/audio/


Looks like it uses a "Yocto" script - https://github.com/elk-audio/elkpi-sdk - https://www.yoctoproject.org/

I thought of Ubuntu Studio as well, but it looks like Elk is intended to be more low-level / for appliances. I'm not totally sure though, but I'm guessing it doesn't come with the full suite of Ardour etc.

I do wish there was something like Ubuntu Studio, except focused on audio-only (and more polished / supported). Mostly just because JACK + low-latency kernels are an ordeal to set-up and maintain on a normal desktop, so it's much easier to just install a pre-configured distro on a studio computer and be ready to jam whenever inspiration strikes. Unfortunately KXStudio as a separate distro isn't around any more. I still use a KXStudio 14.04 install on an old (airgapped) DAW computer.


> Mostly just because JACK + low-latency kernels are an ordeal to set-up and maintain on a normal desktop

I wouldn't call linux-rt from the AUR too hard to install, but I can see how it would be a pain for someone who just wanted to pick up their software and do their thing without a long setup. (Since there would probably be more things to setup aside from just the realtime kernel)


Yeah, JACK / ALSA / PulseAudio is probably the harder thing to straighten out, and if you are using the computer in normal usage as well switching back and forth is a perpetual head-ache filled with edge cases and incompatibilities.

A turn-key audio-only OS like KXStudio 14.04 was truly perfect for pro audio, since it had Ardour + 100s of plug-ins and tools all pre-installed and configured.


1 sample latency or it didn't happen.


Cookie consent by scrolling doesn't seem like a legitimate way to obtain user consent.


I mean, it's no worse than the standard antipatterns, which are designed to be so easy to click "accept" and so hard to follow the "don't accept" flow. None of them are about coming to an agreement with the user; all of them are about obtaining as much data as possible. Until the EU mandates a handwritten letter to obtain permission for tracking cookies, these demonstrations of bad faith will be our friends on the internet.


I just hit agree and let my browser extensions handle the rest by blocking tracking cookies.


[flagged]



Hahah.. Funny to see this same comment about dropbox ($10B company)


The same Dropbox that once had a broken login form where a blank password let people in to any account?

Somebody's effort being successful isn't necessarily a good thing. See Facebook.


But it illustrates that "that's easy" or "I can do that by setting up Linux with an X and Y and router rules Z and a dynamic IP Q" isn't the same as user friendly and isn't a bar to commercial success.


Why is everything about commercial success?

As for easy, this is generally achieved by installing the relevant packages, in whatever distribution you're using.

True for at least Debian, Ubuntu, Arch.



