That Whole Thing with Sound in In-Browser Emulation (textfiles.com)
81 points by xai3luGi on Jan 11, 2015 | 20 comments



The author bemoans overly complex modern web standards, but surely this is an inevitable result of the web's transition from "an easy way to view documents" to "a safe mechanism for remote code execution". I mean sure, in the old days anyone could learn the important HTML tags in a weekend, but they couldn't use those tags to emulate an MS-DOS game. The only way to enable that was to gradually evolve web APIs that safely exposed all the low-level functionality.

Or, the only way apart from plugins, anyway. In comments the author writes:

> In the future, I suspect some group will do it “right” and create a wonderful plugin or default wrap-in for all browsers that will simply function as the native environment all this material needs. At that point, MESS and DOSBOX will be ported to this...

...which is essentially Flash (which has had solid dynamic audio for years, and console emulators built on it with good audio support; see: nesbox). In a parallel universe where Flash was an open standard, it'd be exactly what the author is asking for. Except in that universe, Flash would presumably share whatever problems the author finds with web standards.

Unless we want to choose One Browser or One Plugin, I'd say we're better off with complicated, slow-moving standards (like Web Audio), even if they are developed and implemented by adversarial companies.


I agree... as much as Web Standards and their implementations are suboptimal, I remember a time when you were GUARANTEED that the style and layout you wrote for one browser would simply not work in any other browser. I don't think people remember how bad it was and how much better it is now.

It was like every browser was its own terrible brand of IE6. Each with its own weirdness that permeated every aspect of the browser.

You ended up having to write as many stylesheets as browsers you wanted to support (ok, ok, Netscape, Firefox, and the Mozilla Suite shared the same stylesheet).

"Cross-Browser" was as unattainable and desired as the Holy Grail, "This site works best in IE6/Firefox/Opera/Safari" was not a suggestion like "This wine pairs best with cheese" but a warning of terrible consequences like "This milk is best before curdled".

Ok, so we're still not 100% cross-browser 100% of the time, but it's a term people don't often need to think about anymore.


You can chart a history of the Web by what you could reasonably do cross-browser.

In the beginning, it was formatted text. That was it. You had bold, underline, italic, and maybe even justification, but images relied on separate helper programs and weren't displayed inline, but in their own windows. And that was assuming the web page and your system had at least one image format in common. (XBM? TIFF? Remember those? I'm not saying they're no longer used, but when was the last time you saw them on the Web?)

By the Here Comes Everybody stage, we'd pretty well gotten inline images and even background images sorted, to the limits of color space (256 color is... better than CGA?) and the few format problems left, but fonts certainly didn't transfer and the only music format which was even remotely feasible was MIDI, which is less an audio format and more like sheet music, in that it has to be played by software instruments on your own system. Interesting concept, but no guarantee of sound quality at the receiving end.

Skipping a bit, we now have inline images sorted out, fonts sorted out, a workable CSS standard pretty well universal, and even video mostly working, despite the political and legal complications.

Now we're annoyed that we can't use our browsers to play games that were written for computers still in use in the Web's early days. 'Progress' is getting annoyed about increasingly interesting and exciting things, it seems.


True, and it's worth noting that it was only midway through this process that people started expecting different browsers to render pages identically. In the early days HTML merely structured the information; styling was assumed to be left up to user preference.


I think the main problem with WebAudio is that it was originally designed under the assumption that Javascript would be too slow to fill audio buffers fast enough for low latency buffer queueing, and thus this complex audio-node system was conceived where audio data would flow through black-box processing nodes written in C/C++.

In a perfect world, a low-level web audio API would focus on providing as direct a way as possible to stream raw audio data (buffer queueing as in OpenAL, or a wrap-around buffer as in DirectSound), make sure that this works well with WebWorkers (i.e., a worker thread should be able to queue new data when required, not only when the main loop gets around to it), and move all the complicated high-level audio-graph stuff into optional Javascript libs.
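
Something like this, say (a purely hypothetical sketch - createRawStream, queue, and onneedmore exist nowhere, they're just the shape I'd want):

    // Entirely hypothetical API - none of these names exist in any browser.
    // The point: hand the engine raw samples and let it pull more as needed.
    const ctx = new AudioContext();
    const stream = ctx.createRawStream({ channels: 1 });

    // Ideally this callback could live in a WebWorker, so audio keeps
    // flowing even while the main thread is busy.
    stream.onneedmore = function () {
      const chunk = new Float32Array(2048);
      fillWithSamples(chunk); // app-specific synthesis/decoding (stand-in)
      stream.queue(chunk);    // analogous to OpenAL's alSourceQueueBuffers
    };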


That's what Mozilla's proposed MediaStream Processing API was: https://dvcs.w3.org/hg/audio/raw-file/tip/streams/StreamProc...

But Google was able to put more engineering effort behind the Web Audio API and get a lot of web sites to use it, so that's what won. The fact that Web Audio's ScriptProcessorNodes run on the main thread instead of in workers is basically a travesty, brought about because they were tacked on as an afterthought to claim "feature parity" with the MediaStream Processing API. Until that's fixed [1], real-time audio generation in JavaScript will always be a joke that only works under ideal conditions where people open just one, well-tuned webpage in their browsers at a time and GC pauses don't exist.
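
For anyone who hasn't run into this: a minimal main-thread generator with a ScriptProcessorNode looks like the sketch below. Everything in onaudioprocess competes with layout, GC, and whatever else the page is doing; miss one buffer deadline and you hear it.

    const ctx = new AudioContext();
    const node = ctx.createScriptProcessor(4096, 1, 1); // bufferSize, ins, outs
    let phase = 0;

    node.onaudioprocess = function (e) {
      // Runs on the main thread - a GC pause here is an audible glitch.
      const out = e.outputBuffer.getChannelData(0);
      for (let i = 0; i < out.length; i++) {
        out[i] = 0.2 * Math.sin(phase); // quiet 440 Hz test tone
        phase += 2 * Math.PI * 440 / ctx.sampleRate;
      }
    };
    node.connect(ctx.destination);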

(full disclosure: I work for Mozilla and am not bitter at all)

[1] https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A... is the proposal to fix it, only three years late to the party


I'm currently trying to get realtime audio streaming over WebSockets to work. I'm encoding audio using libspeex on the server side (an iOS device) and decoding it again on the client using speex.js[1].

Transmitting Speex "frames" (where each frame is a 20ms chunk of audio) over WebSockets and decoding in JS works beautifully. However, I have a really hard time queueing up these 20ms AudioBuffer nodes perfectly, without producing any glitches or pops. I'm not sure it's even possible with the current API.

What I would like to have is an AudioBuffer node where I can dynamically append chunks while it's playing. Since all(?) source nodes are one-time use only, the browser could free the data that has already been played.

An AudioBuffer that allows modification of the data while it's playing would work too - you could just loop it and use it as a ring buffer. However, my understanding of the spec is that modifying an AudioBuffer's data is explicitly prohibited once it's playing.
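
For reference, my current scheduling attempt looks roughly like this (simplified: decodeSpeexFrame is a stand-in for the speex.js call, the URL is made up, and I'm glossing over resampling from Speex's rate to the context rate):

    const ctx = new AudioContext();
    const ws = new WebSocket('wss://example.com/audio'); // stand-in URL
    ws.binaryType = 'arraybuffer';
    let nextStartTime = 0;

    function enqueueChunk(float32Samples) {
      const buffer = ctx.createBuffer(1, float32Samples.length, ctx.sampleRate);
      buffer.getChannelData(0).set(float32Samples);

      const source = ctx.createBufferSource();
      source.buffer = buffer;
      source.connect(ctx.destination);

      // Schedule chunks back-to-back on the context clock. Sample-accurate
      // in theory - but when a chunk arrives after its slot has passed, the
      // playhead has to be reset to currentTime, and there's the pop.
      nextStartTime = Math.max(nextStartTime, ctx.currentTime);
      source.start(nextStartTime);
      nextStartTime += buffer.duration;
    }

    ws.onmessage = function (e) {
      enqueueChunk(decodeSpeexFrame(e.data)); // decodeSpeexFrame: stand-in
    };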

[1] https://github.com/jpemartins/speex.js/


You might be interested in looking at what https://github.com/brion/ogv.js does.

Just curious, if your goal is realtime, why aren't you using WebRTC?


Thanks, I'll have a look!

I can't use WebRTC yet because I want to have support for IE and especially Safari & Mobile Safari. I want to add audio for an iOS App[1] that's using jsmpeg[2] for video streaming.

[1] http://instant-webcam.com/

[2] https://github.com/phoboslab/jsmpeg


IE doesn't support Web Audio either. The project I linked above does contain a Flash shim for IE, which you may find helpful.

Any proper realtime audio stack (which ogv.js is not: it targets moderate latency streaming) is going to require a jitter buffer to deal with network/processing latency variation. Writing one is a non-trivial exercise (though not impossibly so).
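
The simplest version is just to hold playback until a small backlog exists, trading latency for slack. A sketch of the idea (the chunk count is illustrative, and enqueueChunk is the back-to-back scheduling function from the sketch upthread):

    // Trivial jitter buffer: wait for a backlog before starting, so every
    // chunk gets scheduled well ahead of its playback deadline. A real one
    // also handles loss, reordering, and clock drift - that's the
    // non-trivial part.
    const PREBUFFER_CHUNKS = 5; // 5 x 20ms = 100ms of slack
    const backlog = [];
    let playing = false;

    function onChunkDecoded(samples) {
      backlog.push(samples);
      if (!playing && backlog.length >= PREBUFFER_CHUNKS) playing = true;
      if (playing) {
        while (backlog.length) enqueueChunk(backlog.shift());
      }
    }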

In any case, I'd recommend using Opus (the audio codec used by WebRTC) over Speex.


I put together a mobile-friendly version of a Javascript NES emulator for a hackathon [1], and I converted it to use the Web Audio API[2]. It works, but sounds very janky. I had to divide all of the integer audio samples into floating point values just so that they could be converted to integer values at some point later on. I also wanted to be able to use a gamepad for the emulator, but the last revision to the Gamepad API was a year ago[3], and browser support is pretty lacking[4]. I really like the idea of browser emulation because native emulators are banned from the App Store, but the web APIs are just not there yet.
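
The conversion itself is trivial, just wasteful, since the browser re-quantizes on the way out anyway. Roughly (assuming signed 16-bit samples from the core):

    // Web Audio wants Float32 samples in [-1.0, 1.0]; the emulator core
    // produces integer PCM, hence a divide on every single sample.
    function int16ToFloat32(int16Samples) {
      const out = new Float32Array(int16Samples.length);
      for (let i = 0; i < int16Samples.length; i++) {
        out[i] = int16Samples[i] / 32768; // 32768 = 2^15
      }
      return out;
    }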

Side note: If anyone wants to take over webNES and make it awesome, feel free to message me

[1] http://webn.es

[2] https://github.com/conradev/jsnes/commit/6b0ef8d5b5d0a7b17e6...

[3] https://dvcs.w3.org/hg/gamepad/raw-file/default/gamepad.html

[4] http://caniuse.com/#feat=gamepad


I'd be curious to know what's not there.

What's wrong with the Web Audio ScriptProcessor node for example?



I know several people who have been going over the audio issues in browsers. I remember how painful Flash's audio was to deal with, and the browser vendors have managed to make something worse... I do hope that web audio gets better (or even good, for that matter).

Really getting where people want to go with browser-based gaming (even if the idea disgusts you) is going to hinge on 3D graphics, great audio composition, and good controller capabilities.


What's the problem? The Web Audio API is being implemented by IE and is already in FF, Chrome, and Safari (both OSX and iOS). Is there something missing from it?


I used Flash audio back in the day and I hardly found it painful.


I didn't find it painful at all either. Neither did these guys - http://www.audiotool.com


Here's hoping that Spartan is not only on a rapid release cycle, but that its forks of Trident and Chakra are developed as open-source projects (like Gecko, WebKit, and Blink), so developers can submit fixes and features and not have to wait a year for them to be available.


There is a temporary workaround, I would think. Presumably the data that games expect to be driving sound comes out in a byte stream, like anything else, expecting to be passed to a side-effect-producing function provided by the environment. You wouldn't even need to write the code that processes that data into sound: you could just sample the output (perhaps on original equipment), provide sound resource files, and then make sure to play the right sample for any particular set of bytes passed to your implementation. (Even if Web Audio standardizes, it's not clear to me how you would handle essentially arbitrary output-to-be-interpreted-as-sound from an arbitrary process!)
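
Concretely, something like this (all hypothetical - the byte patterns, sample files, and playSample are stand-ins, not real data):

    // Key pre-recorded sounds by the byte pattern the game writes to the
    // emulated sound hardware. Patterns and files are invented examples.
    const sampleTable = new Map([
      ['0xb6,0x3f', 'samples/jump.wav'],
      ['0xb6,0x1a', 'samples/explosion.wav'],
    ]);

    function onSoundPortWrite(bytes) {
      const key = Array.from(bytes, function (b) {
        return '0x' + b.toString(16);
      }).join(',');
      const url = sampleTable.get(key);
      if (url) playSample(url); // any <audio>/Web Audio playback helper
    }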


The color choice of that page is insane. My eyes literally hurt after reading the article.



