Hello everyone, thanks for checking out the new repository. I've resolved the CDN link issues, but feel free to file any more issues at https://github.com/Google/songbird/issues. Looking forward to seeing all the great stuff you all make with it. :)
Thanks! I'm confused about what this is, exactly, from the description:
> Songbird is a JavaScript API that supports real-time spatial audio encoding for the Web using Higher-Order Ambisonics (HOA). This is accomplished by attaching audio input to a Source which has associated spatial object parameters. Source objects are attached to a Songbird instance, which models the listener as well as the room environment the listener and sources are in. Binaurally-rendered ambisonic output is generated using Omnitone, and raw ambisonic output is exposed as well.
My confusion is that the Web Audio API [1] also supports real-time spatial audio for the Web [2]. It looks like Ambisonics is a format that encodes spatial audio into a fixed set of audio channels, rather than just playing audio into PannerNodes directly.
Some questions: (1) Is Songbird indeed an alternative to the PannerNode API like I'm suspecting? (2) If so, why would you want to downmix your audio into a set of intermediate channels, rather than play each source directly into a PannerNode? (3) Is there any advantage to using Omnitone, which I suspect does the HRTFs, rather than using a PannerNode and its HRTFs directly?
Thanks for your interest! Let me try to clarify and answer your questions:
1. Songbird is indeed an enhanced alternative to PannerNode.
2. It works with ambisonics internally but outputs stereo (we use Omnitone to render the multichannel ambisonic audio down to a binaural stereo track).
The general reason people use ambisonics instead of direct HRTF rendering is that ambisonics allows head rotation to be applied prior to rendering, so the user can turn their head freely without you having to adjust every incoming source's HRTF.
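Concretely, a head turn becomes one rotation applied to the encoded soundfield before the binaural decode, so the rendering cost stays fixed no matter how many sources are playing. Something like this (method name/signature shown as I understand it, so treat it as illustrative):

    // One call rotates the entire encoded soundfield; no per-source work.
    songbird.setListenerOrientation(
        forwardX, forwardY, forwardZ,  // look direction
        upX, upY, upZ);                // up vector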
The reason we feel Songbird is an upgrade to PannerNode is three-fold:
One, you can control the quality of the localization/spatialization effect by adjusting ambisonicOrder (1st through 3rd order at the moment; see the sketch after this list).
Two, PannerNode is costly: two convolutions per source, while Songbird uses a fixed number of convolutions regardless of the number of sources, so it ends up giving you more for less.
Three, PannerNode doesn't support any sort of room modelling, while Songbird produces spatialized (ambisonic) room reflections and reverberation.
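Putting that together, a minimal setup looks something like this (condensed from the README; the sample path is illustrative):

    // Create a third-order Ambisonic Songbird scene on a Web Audio context.
    var audioContext = new AudioContext();
    var songbird = new Songbird(audioContext, {ambisonicOrder: 3});

    // Binaurally-rendered stereo goes straight to the output.
    songbird.output.connect(audioContext.destination);

    // Each sound becomes a Source with its own spatial parameters.
    var audioElement = document.createElement('audio');
    audioElement.src = 'resources/speech.wav';  // illustrative path
    var elementSource = audioContext.createMediaElementSource(audioElement);

    var source = songbird.createSource();
    elementSource.connect(source.input);
    source.setPosition(-0.7, -0.7, 0);  // meters, relative to the room center

    audioElement.play();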
Should Songbird users be concerned about Creative's spatial audio patents? Will Google provide legal counsel if I get sued for using Songbird to provide spatial audio in a game?
"On March 5, 1998 Creative Labs sued Aureal for patent infringement. Aureal countersued because they believed Creative was guilty of patent infringement. After numerous lawsuits Aureal won a favorable ruling in December 1999,[1] which vindicated Aureal from these patent infringement claims, but the legal costs were too high and Aureal filed for bankruptcy. On September 21, 2000, Creative acquired Aureal's assets from its bankruptcy trustee for US$32 million. The purchase included patents, trademarks, other property, as well as a release to Creative from any infringement by Creative of Aureal's intellectual property including A3D. The purchase effectively eliminated Creative's only competition in the gaming audio market. It also eliminated any requirements for Creative to pay past or future royalties as well as damages for products which incorporated Aureal's technology."
Aureal's tech was based on HRTFs, though, which are covered by a different (now expired, IIRC) patent than Ambisonics.
Everytime I see something like this rolling out of Google, all I can think is "if only this was a real project, and not yet-another-Google-shop-project". And no, that's not fair towards the people who make it, but then the entire thing is clearly marked as copyright Google, not copyright the people who deserve the credit, so Google doesn't even want me to think of this as something cool made by cool people, but another library pumped out by Google for the betterment of a market position.
What's bothering you?
Google apparently allows people to make awesome stuff on company time, and then releases it under the Apache License 2.0, which seems to be a pretty permissive license (https://github.com/google/songbird/blob/master/LICENSE).
It may not get the attention it deserves, but no project is guaranteed to have a maintainer-for-life.
Mostly the part where it stays a Google project, and despite the Apache license, no one can contribute without signing a CLA that discriminates against any developer without a phone or a permanent address. "The lawyers insisted" doesn't change the fact that you think you're releasing useful code into the world when you're really presenting something locked down so hard that people have to hand over their direct personal information before you let them help make it better (Facebook does the same thing, which is why I don't contribute to their projects despite loving quite a few of them and using them daily).
Hmm. None of the examples are working for me. "The HTMLMediaElement passed to createMediaElementSource has a cross-origin resource, the node will output silence." in the console.
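For anyone else hitting this: createMediaElementSource outputs silence for cross-origin media unless the element opts into CORS and the server allows it. If you're loading your own audio, the usual fix looks like this (URL hypothetical):

    var audioElement = document.createElement('audio');
    audioElement.crossOrigin = 'anonymous';  // request the file with CORS
    audioElement.src = 'https://cdn.example.com/sample.wav';
    // The server must also reply with an appropriate
    // Access-Control-Allow-Origin header, or you still get silence.
    var node = audioContext.createMediaElementSource(audioElement);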
The spatial effect is not that bad, and I say that as someone who (1) loves 5.1 and other surround formats in games, movies and music, and (2) is usually unimpressed with headphone surround.
This is great to see! I have a side project to port the old Peep Network Auralizer to the web [1], and that code already has the concept of mono sounds originating at different points in a 3D space, so this should be relatively straightforward to integrate. I was aiming for Android/iPhone compatibility, though; is there any multichannel audio support on mobile yet, or would I need a fallback?
Songbird renders stereo-out using Omnitone internally, so Android/mobile is certainly supported. Feel free to file any issues you have at https://github.com/Google/songbird/issues.
I've had lots of trouble with Web Audio and rendering to a file as output, though that was in the infancy of the relevant technology. I can't seem to find any information on this for Songbird or Omnitone, but if it works it opens up a bunch of possibilities.
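Plain Web Audio can at least render a graph to an AudioBuffer with an OfflineAudioContext, and since Songbird seems to just wrap an audio graph, it may work on top of one too. An untested sketch:

    // Untested assumption: Songbird accepts an OfflineAudioContext.
    // Render a 10-second stereo scene at 44.1 kHz into an AudioBuffer.
    var offlineCtx = new OfflineAudioContext(2, 44100 * 10, 44100);
    var songbird = new Songbird(offlineCtx);
    songbird.output.connect(offlineCtx.destination);
    // ...create sources and schedule their playback as usual...
    offlineCtx.startRendering().then(function(renderedBuffer) {
      // renderedBuffer is an AudioBuffer; encoding it to WAV (or any
      // other file format) is up to you or a library.
    });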
Had my headphones on backwards and got confused when I moved the S and L around. The S was obviously the "Subject"/me, but I couldn't figure out what the L was for. I settled for "Loud thing".
Haha. Actually, you might have had your headphones on correctly all along. S was "Source" and L was "Listener". I've added some clarification to the examples. Thanks for testing it out!
Are there any FOSS tools for capturing custom HRTFs, with photogrammetry for instance? I found a paper that used a Leap Motion for custom HRTF capture, but they didn't publish any code.
We're using a room acoustics model that captures early and late reflections based on the acoustic properties (dimensions and materials) of the room. :)
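In the API, that boils down to describing the shoebox room the listener and sources share; something like this (dimensions in meters, material names illustrative):

    songbird.setRoomProperties(
        {width: 3.1, height: 2.5, depth: 3.4},
        {left: 'brick-bare', right: 'curtain-heavy',
         up: 'marble', down: 'grass',
         front: 'plywood-panel', back: 'glass-thin'});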
Hi Science404, yes indeed we're launching with the standard shoebox for now, but obviously we're thinking about the future too. :) Currently we calculate listener-based 1st-order reflections, optimized for performance. Once the ecosystem out there gets faster, we can explore fancier methods. ;) You can see this in EarlyReflections.js.