Songbird: Spatial Audio Encoding on the Web (google.github.io)
201 points by runesoerensen on Aug 16, 2017 | hide | past | favorite | 42 comments



Hello everyone, thanks for checking out the new repository. I've resolved the CDN link issues, but feel free to file any more issues at https://github.com/Google/songbird/issues. Looking forward to seeing all the great stuff you all make with it. :)


Thanks! I'm confused about what this is, exactly, from the description:

> Songbird is a JavaScript API that supports real-time spatial audio encoding for the Web using Higher-Order Ambisonics (HOA). This is accomplished by attaching audio input to a Source which has associated spatial object parameters. Source objects are attached to a Songbird instance, which models the listener as well as the room environment the listener and sources are in. Binaurally-rendered ambisonic output is generated using Omnitone, and raw ambisonic output is exposed as well.

My confusion is that the Web Audio API [1] also supports real-time spatial audio for the Web [2]. It looks like Ambisonics is a format that encodes spatial audio into a fixed set of audio channels, rather than just playing audio into PannerNodes directly.

Some questions: (1) Is Songbird indeed an alternative to the PannerNode API like I'm suspecting? (2) If so, why would you want to downmix your audio into a set of intermediate channels, rather than play each source directly into a PannerNode? (3) Is there any advantage to using Omnitone, which I suspect does the HRTFs, rather than using a PannerNode and its HRTFs directly?

[1] https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...

[2] https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...


Hi Toomin,

Thanks for your interest! Let me try to clarify and answer your questions:

1. Songbird is indeed an enhanced alternative to PannerNode.

2. It internally works with ambisonics, but outputs stereo (we use Omnitone internally to render the multichannel audio down into a stereo track).

The general reason people use ambisonics instead of direct HRTF rendering is that ambisonics allows for head rotation prior to rendering, so the user can easily turn their head, etc., without you having to adjust all the incoming sources' HRTFs.
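To illustrate why that matters: in first-order ambisonics, a yaw turn of the listener's head only requires mixing the two horizontal channels with a 2D rotation matrix, no matter how many sources were encoded into the field. A minimal sketch (channel ordering and axis conventions vary between ambisonic formats; this assumes B-format-style W/X/Y/Z with X forward and Y left):

```javascript
// Rotate a first-order ambisonic frame [W, X, Y, Z] by a yaw angle
// (radians). W (omnidirectional) and Z (vertical) are unaffected by
// yaw; only the horizontal X/Y pair is mixed with a standard 2D
// rotation. The cost is fixed regardless of how many sources were
// encoded into the field.
function rotateYaw([w, x, y, z], theta) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [w, c * x - s * y, s * x + c * y, z];
}

// A source directly ahead (all directional energy in X) rotated by 90
// degrees ends up almost entirely in Y: the whole scene turns at once.
console.log(rotateYaw([1, 1, 0, 0], Math.PI / 2));
```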

The reason we feel Songbird is an upgrade to PannerNode is three-fold:

One, you can control the quality of the localization/spatialization effect by adjusting ambisonicOrder (1st to 3rd, at the moment).

Two, PannerNode is costly: two convolutions per source, while Songbird uses a fixed number of convolutions regardless of the number of sources, so it ends up letting you get more for less.
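Some back-of-envelope arithmetic for that point (the per-source figure of two convolutions is from the comment above; the assumption that the binaural decode costs roughly one convolution per ambisonic channel is illustrative, and real implementations differ): direct HRTF panning scales with the source count, while the ambisonic decode is fixed at (order + 1)² channels.

```javascript
// Illustrative convolution counts: per-source HRTF panning vs. a
// shared ambisonic binaural decode. Assumes 2 convolutions per
// PannerNode source (left + right ear) and roughly one convolution
// per ambisonic channel for the decode.
function pannerConvolutions(numSources) {
  return 2 * numSources;
}

function ambisonicConvolutions(order) {
  return (order + 1) ** 2; // channel count for a given ambisonic order
}

// With 20 sources: 40 convolutions direct, vs. a fixed 16 for
// third-order ambisonics no matter how many sources you add.
console.log(pannerConvolutions(20), ambisonicConvolutions(3)); // → 40 16
```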

Three, PannerNode doesn't support any sort of room modelling, while Songbird produces spatialized (ambisonic) room reflections and reverberation.

Hope this helps clarify things! :)

Cheers, Drew


That's exactly the information I needed! Thank you! It would be great to have this on the homepage.


Noted! Will add a better explanation to the README. :)


Should Songbird users be concerned about Creative's spatial audio patents? Will Google provide legal counsel if I get sued for using Songbird to provide spatial audio in a game?


Eli, I think you started with a reasonable question, but then followed it up with an unnecessarily passive-aggressive question.


It's a valid concern, though. People have been sued in the past for patent infringement resulting from using libraries provided by Google: http://www.x-plane.com/x-world/lawsuit/details/


Not when you know that Creative sued some companies to death: https://en.wikipedia.org/wiki/Aureal_Semiconductor

"On March 5, 1998 Creative Labs sued Aureal for patent infringement. Aureal countersued because they believed Creative was guilty of patent infringement. After numerous lawsuits Aureal won a favorable ruling in December 1999,[1] which vindicated Aureal from these patent infringement claims, but the legal costs were too high and Aureal filed for bankruptcy. On September 21, 2000, Creative acquired Aureal's assets from its bankruptcy trustee for US$32 million. The purchase included patents, trademarks, other property, as well as a release to Creative from any infringement by Creative of Aureal's intellectual property including A3D. The purchase effectively eliminated Creative's only competition in the gaming audio market. It also eliminated any requirements for Creative to pay past or future royalties as well as damages for products which incorporated Aureal's technology."

Aureal's tech was based on HRTF, though, which is covered by a different (now expired, IIRC) patent than Ambisonics.


Every time I see something like this rolling out of Google, all I can think is "if only this were a real project, and not yet another Google shop project." And no, that's not fair to the people who made it. But the entire thing is clearly marked as copyright Google, not copyright the people who deserve the credit, so Google doesn't even want me to think of this as something cool made by cool people; it's just another library pumped out by Google for the betterment of a market position.


What's bothering you? Google apparently allows people to make awesome stuff on company time, and then releases it under the Apache License 2.0, which seems to be a pretty permissive license (https://github.com/google/songbird/blob/master/LICENSE).

It may not get the attention it deserves, but no project is guaranteed to have a maintainer-for-life.


Mostly the part where it stays a Google project: despite the Apache license, no one can contribute without signing a CLA that discriminates against any developer without a phone or a permanent address. "The lawyers insisted" means I don't like the way you think you're releasing useful code into the world when you're really presenting something you locked down so hard that people need to give you their direct personal information before you let them help make it better. (Facebook does the same thing, which is why I don't contribute to their projects despite loving quite a few of them and using them daily.)


>allows people to make awesome stuff during business time

Isn't that R&D? A lot of companies do it.



Yeah... I thought that was gaining a second life. :(


Such a solid media player for its time.


Hmm. None of the examples are working for me. "The HTMLMediaElement passed to createMediaElementSource has a cross-origin resource, the node will output silence." in the console.

macOS Sierra 10.12.5, Firefox 55 64-bit


Didn't have this issue in latest Chrome on OSX.

The current demo URLs use `rawgit.com` but reference the script at `cdn.rawgit.com`, so maybe try prefixing the URLs with "cdn" to avoid cross-origin resources altogether? For instance https://cdn.rawgit.com/google/songbird/master/examples/room-...

Edit: Fixed since https://news.ycombinator.com/item?id=15032253


Worked OK on Microsoft Edge.

The spatial effect is not that bad, and that is speaking as someone who (1) loves 5.1 and other surround in games, movies and music, and (2) is usually unimpressed with headphone surround.


:)


Same. No examples work on either FF or Chrome, Fedora 25. Audio from both browsers works perfectly fine for other things.


No luck on iOS. Not sure if it claims to be supported.


Hi Ben174,

I'm aware of the issue and I hope to have it resolved within the day. :)

Cheers, Drew


Worked great for me with Vivaldi on Ubuntu 17.04.


Same.


This is great to see! I have a side project to port the old Peep Network Auralizer to a web-based project[1], and the code already has the concept of mono sounds originating at different points in a 3D space, so this should be relatively straightforward to integrate. I was aiming for Android/iPhone compatibility though; is there any multichannel audio support on mobile yet, or would I need a fallback?

[1] https://github.com/jhurliman/webpeep


Hi!

Songbird renders stereo-out using Omnitone internally, so Android/mobile is certainly supported. Feel free to file any issues you have at https://github.com/Google/songbird/issues.

Cheers, Drew


How is this better than PannerNode, which is built into HTML5? I can hear the difference in the demo, but can someone put words to it?


I think part of it is room simulation. I might be wrong, but I don't think PannerNode supports things like the room materials like in the 2nd demo


To properly enjoy ambisonics you need a system with multiple loudspeakers like this https://www.york.ac.uk/media/electronic-engineering/postgrad...


FreeSWITCH can do this in mod_conference using OpenAL; it's pretty sweet.


I've had lots of trouble with WebAudio rendering to a file as output, though that was in the infancy of the relevant technology. I can't seem to find any information on this for Songbird or Omnitone, but if it works it opens up a bunch of possibilities.


Had my headphones on backwards and got confused when I moved the S and L around. The S was obviously the "Subject"/me, but I couldn't figure out what the L was for. I settled on "Loud thing".


Haha. Actually, you might have had your headphones on correctly all along. S was "Source" and L was "Listener". I've added some clarification to the examples. Thanks for testing it out!


Are there any FOSS tools for capturing custom HRTFs? With photogrammetry, for instance? I found a paper using a Leap Motion for custom HRTF capture, but they didn't publish any code.

https://home.deib.polimi.it/antonacc/pubs/ICASSP_16_2.pdf


Woah!!! This is perfect for a project I'd like to do....


Ahh, this is like the thing in the Realtek audio mixer that lets me change the room type, but in my browser and in real time.


This is neat!

Can you share what techniques you are using for reverberation processing? Early and diffuse reflections?


We're using a room acoustics model that captures early and late reflections based on the acoustic properties (dimensions and materials) of the room. :)


Worth mentioning that this is just a box-shaped room. How many early reflections are calculated?


Hi Science404, yes indeed we're launching with the standard shoebox for now, but obviously we're thinking about the future too. :) Currently we calculate listener-based first-order reflections, optimized for performance. Once the ecosystem out there gets faster, we can explore fancier methods. ;) You can see this in EarlyReflections.js.
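For the curious, here's a simplified stand-alone sketch of the general image-source idea behind first-order shoebox reflections (this is not the actual EarlyReflections.js code; the absorption model, 1/distance gain, and speed-of-sound constant are assumptions for illustration). Each of the six walls mirrors the source across its plane, yielding one delayed, attenuated copy per wall:

```javascript
// Image-source sketch of first-order reflections in a shoebox room.
// Each wall contributes one image source mirrored across its plane;
// delay = distance / speed of sound, gain attenuated by wall
// absorption and distance. Illustration only.
const SPEED_OF_SOUND = 343; // m/s, assumed room temperature

function firstOrderReflections(source, listener, room, absorption) {
  // room = { width, depth, height }; positions are [x, y, z] metres,
  // with the origin at one corner of the box.
  const images = [
    [-source[0], source[1], source[2]],                  // x = 0 wall
    [2 * room.width - source[0], source[1], source[2]],  // x = width wall
    [source[0], -source[1], source[2]],                  // y = 0 wall
    [source[0], 2 * room.depth - source[1], source[2]],  // y = depth wall
    [source[0], source[1], -source[2]],                  // floor
    [source[0], source[1], 2 * room.height - source[2]], // ceiling
  ];
  return images.map((img) => {
    const d = Math.hypot(
      img[0] - listener[0],
      img[1] - listener[1],
      img[2] - listener[2]
    );
    return { delaySec: d / SPEED_OF_SOUND, gain: (1 - absorption) / d };
  });
}

// Six reflections for a source and listener inside a 4 x 4 x 3 m room.
console.log(
  firstOrderReflections([1, 1, 1], [2, 2, 1], { width: 4, depth: 4, height: 3 }, 0.3)
);
```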

Cheers, Drew


I'm so glad ambisonics is finally getting its day.



