To others trying to use this demonstration:
Be patient with the page, as it has to download quite a few JAR files before it is ready to compile and run any test code. To verify that JavaPoly is ready to run your code, open the web developer console and wait for it to say "Java Main started". First-time compilation also needs to download additional JARs for the demo code to run. Generally speaking though, compilation takes a good 5-10 seconds on my computer for the demo code, so be sure not to just spam the compile button like I did :)
----
Just a friendly note to the author of this page:
The total download size of this page's underlying Java resources is much larger than I realized (maybe 30-40MB collectively). Given that I waited almost 5 minutes after clicking "Compile & Run!" for nothing to happen (because I had clicked it well before the runtime JARs had fully downloaded), I would suggest at least adding a progress bar or something similar to let users know when they can actually compile and run the demo code, in addition to some acknowledgement that the code is compiling. This is definitely an interesting project though.
Yeah, Doppio needs to download the entire JDK to run, since it's a full-featured JVM (JRE for running Java programs, JDK for javac so it can compile them). Compressed, it's ~30MB.
The authors of this project can preload the JDK with a progress bar and stash it into IndexedDB, since Doppio's file system, BrowserFS, supports arbitrary backends [1].
If the authors compressed the JAR files into a single bundle, storing the data into a BrowserFS-created IndexedDB file system removes decompression/extraction overhead on subsequent visits.
If they didn't.... then it might be sufficient, but I am not sure how well the browser cache handles large files! How long until it evicts them? Is there an upper bound on the size of cached items? etc.
I am only familiar with Firefox, where the default limit is 43.75MiB; anything larger than that is not cached.
There's a `browser.cache.disk.max_entry_size` setting that defaults to 51200 (50MiB), however the code also has an explicit override that no item larger than 1/8 of the total cache size (as dictated by `browser.cache.disk.capacity`, default 350MiB) is ever cached, hence the 43.75MiB limit.
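To make the arithmetic behind the 43.75MiB figure concrete, here is a small sketch. It only restates the rule described above (per-entry limit, capped at 1/8 of the cache capacity); the function and constant names are made up for illustration and are not Firefox identifiers, and I am assuming both values are expressed in KiB, as the 51200 (50MiB) default suggests.

```typescript
// Values are in KiB, matching the 51200 (50MiB) default quoted above.
const KIB_PER_MIB = 1024;

// Largest entry the disk cache would accept under the rule described above:
// the configured per-entry limit, capped at one eighth of the total capacity.
// (Hypothetical helper, not a Firefox API.)
function effectiveMaxEntryKiB(maxEntrySizeKiB: number, capacityKiB: number): number {
  return Math.min(maxEntrySizeKiB, capacityKiB / 8);
}

// Defaults from the comment above: 51200 KiB per entry, 350MiB total capacity.
const defaultLimitKiB = effectiveMaxEntryKiB(51200, 350 * KIB_PER_MIB);
const defaultLimitMiB = defaultLimitKiB / KIB_PER_MIB; // 43.75
```

With the defaults, the capacity-based cap (350MiB / 8) is the smaller of the two values, which is why it wins over the 50MiB per-entry setting.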
All these settings can be changed, obviously, but I suspect few people ever touch them.
I assume other desktop browsers have similar default limits, while mobile browsers probably have a much lower threshold.
Lead author of Doppio here! This is quite cool. I don't recognize the authors of this work, so I was unaware of the project and was surprised to see it on the front page of HN!
If you find any issues with Doppio or have any requests, feel free to open up an issue on our GitHub issue tracker.
> Numeric support. Direct support for 64-bit integers would enable languages to efficiently represent a broader range of numeric types in the browser. The DOPPIOJVM uses a comprehensive software implementation of 64-bit integers to bring the long data type into the browser, but it is extremely slow when compared to normal numeric operations in JavaScript.
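For readers wondering what a software implementation of 64-bit integers involves, here is a minimal sketch of addition using two 32-bit halves. It is not Doppio's, Closure's, or Scala.js's code; the `Long64` type and `addLong` function are hypothetical names chosen for illustration. Even this simplest operation needs several steps per add, which hints at why such implementations are slow compared to plain JavaScript numbers.

```typescript
// A 64-bit integer stored as two signed 32-bit halves (hypothetical type).
interface Long64 {
  high: number; // upper 32 bits, as a signed 32-bit integer
  low: number;  // lower 32 bits, as a signed 32-bit integer
}

// Two's-complement 64-bit addition with wraparound on overflow.
function addLong(a: Long64, b: Long64): Long64 {
  // Treat the low halves as unsigned; the sum is below 2^33, so it is
  // represented exactly by a JavaScript number.
  const lowSum = (a.low >>> 0) + (b.low >>> 0);
  const carry = lowSum > 0xffffffff ? 1 : 0;
  // `| 0` truncates back to a signed 32-bit value.
  const high = (a.high + b.high + carry) | 0;
  return { high: high, low: lowSum | 0 };
}

// 0x00000000FFFFFFFF + 1 carries into the high half.
const carried = addLong({ high: 0, low: -1 }, { high: 0, low: 1 });
```

Multiplication and division need considerably more work than this, since partial products and remainders have to be split into pieces small enough to stay exact.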
Thanks! I hope Doppio can become useful to many more people. :)
There is a proposal for value types [0] that Niko Matsakis [1] and Brendan Eich [2] talked about a couple of years back and could be used to implement 64-bit numbers in JS, but I'm still waiting for an implementation or a more complete proposal. Looks like it might be dead in the water, as the proposal hasn't been edited in quite some time, but I do not have my finger on the pulse of browser standards so I could be wrong.
You could use (the compiled and optimized version of) the implementation we have in Scala.js: https://github.com/scala-js/scala-js/blob/master/library/src...
It is significantly faster, especially for arithmetic operations, and dramatically so for division, remainder, as well as `toString()` (up to 100x speedup). It also contains the unsigned variants of operations (for `java.lang.Long.divideUnsigned`, for instance).
Oh, nice! I'll take a look at that when I next get the chance. I assumed that Closure's Long library, which is based on GWT's Long implementation, would be well-optimized, and never looked into an alternative.
Haha, I've actually tried to do this. The only missing link, I believe, is to add a Nashorn/Rhino backend to BrowserFS (Doppio's file system library) so that Doppio can access the filesystem that Rhino is running in to load the JDK.
If that sounds like fun, I take pull requests [0]. :)
I have an experimental version of chromium running with the JVM somewhere on my HD. It's pretty easy to do with CEF/JCEF.
Two thoughts:
I always imagined something like this should use the <object src="app.jar" /> tag. If you consider that someone could build a browser with Java in it again (like I have), then it would be good to design the polyfill so that it allows a native version to exist (and then doesn't run).
Also, using the same MIME type for JARs, classes, and source files seems suboptimal.
It would be really good if gwt,doppio,teavm all agreed on a single api facade for targeting the browser API. I think that would really put some steam behind the underlying idea here.
> It would be really good if gwt,doppio,teavm all agreed on a single api facade for targeting the browser API.
Unfortunately, that is impossible due to differing requirements; I've actually talked with the teavm and bck2brwsr folks [0]. Doppio requires asynchronous function calls to support preemptive multithreading. TeaVM, GWT, and bck2brwsr all map Java methods directly to synchronous JavaScript methods, preventing them from supporting multithreading.
This is actually one of the larger differences we describe in the academic paper that sets Doppio apart from those projects and Emscripten [1].
Interesting! From a cursory glance at the source code, it looks like they are doing some form of stack splitting at synchronization points, denoted by method annotations. Note that they reimplement the class library and eschew compatibility with traditional Java programs, so I'm not completely sure how general purpose their threads are. I tried finding some writeup about them, but I suspect the code is the documentation. :)
I haven't checked in with this project since we last talked a few years ago, so it's nice to see notable progress! Thanks for the pointer.
One of Doppio's explicit goals was to leverage existing resources in the browser to bring conventional programming languages to the web on top of JavaScript. Using WebAssembly would prevent Doppio from using the browser's garbage collector; we would have to write our own. It would also prevent Doppio from mapping JVM objects onto JavaScript objects.
The WebAssembly standard is constantly evolving, and now contains some ambiguous statements regarding the ability to take advantage of the browser's GC [0]. Considering WebAssembly's current focus on C/C++ code, I do not believe this will come to fruition anytime soon. If it does, I do not see how it would noticeably improve Doppio's current performance, which is bottlenecked primarily by its interpreter. A contributor is working on a JIT right now to make execution faster [1].
This should have a "DK:" prefix (data plan killer). I was browsing on my phone and realized that it was downloading a lot of libraries. That can be a particular problem when roaming in Europe.
Huh - could this run existing Java tools like Google Closure Compiler in the browser? That would be handy for simplifying distribution, especially since the Java installer started bundling crapware.
BTW: I got lots of errors in the Firefox console saying 'Error: Assertion failed: A non-running thread has an expired quantum' although the example works (should have a progress bar :))
The Doppio authors (me) are completely uninvolved; this is an independent effort! This project makes it a bit easier to integrate Doppio and use it out-of-the-box.
I suspect the assertion failure is caused by some of their modifications to Doppio's start up code to pause and extend the main thread's runtime.
> What is the difference of this to doppio? A new approach from some of the doppio authors?
(I contribute to both Javapoly and Doppio)
Javapoly tries to make Doppio easier to use (in my subjective opinion):
* easier loading of jars, classes and Java source code
* a promise based async interface to Java methods
* automatic marshaling of primitive values between JS and Java lands
* a proxy based interface into the Java namespace.
As a very crude analogy: shells and editors make it easy to use the filesystem. But we can't contribute the shell / editor to the filesystem! They sit in different layers of the stack.
It's nice to see this work taking Doppio into new places. For an overview of Doppio, BrowserFS and the DoppioJVM, here's a video presentation mostly based on a talk given by John Vilk at PLDI 2014. Unfortunately, there's no audio, but the slides should be easy enough to follow.
Somewhat in jest let me say, "People are complaining that JavaScript is a terrible language. Great, let's put Java in the browser." Now we have an even worse language to code in for the browser.
Note: jvilk explains the perfectly valid reason for this library in another comment below.
As I understand the Doppio paper, they just simulate JVM threads in synchronous JavaScript. So there does not seem to be added performance benefit at least.
Having text/java as the MIME type, instead of at least looking up the correct one on Wikipedia, makes this implementation look quite ugly at first sight.
https://en.wikipedia.org/wiki/JAR_(file_format)
Also, I cannot imagine a single valid use case for this.
Which to me appears to be the better way around, since it generates static JS instead of having to compile Java every time, as well as having to set up a JVM and whatnot.
It's better if you are writing new code for the web and want to use a different language, but Doppio is better if you need compatibility with existing JVM code. The goals are different.
The main reason is not present in this implementation, which is that clients could execute arbitrary lower-level code in a separate process which the browser was powerless to make secure.
The main goal of Doppio was never speed -- it was compatibility [0]. It's actually a JVM interpreter at present, leading to noticeable slowdown over a native JVM.
Thus, there are many opportunities for performance improvements!
A recent contributor is starting to add a JIT to Doppio, which is a step in the right direction [1].
The most difficult part (I think) of writing a JVM is the concurrent garbage collector. Is this making use of Javascript's garbage collector to do the heavy lifting? Or do they implement their own GC in Javascript? Also, I'm wondering how they are handling concurrency.
Doppio uses JavaScript's garbage collector. (As a result, it cannot support weak references.) As for concurrency, thread quanta are mapped to JavaScript events, so Doppio can emulate preemptive multithreading. Doppio potentially preempts a "thread" at each function call.
The nitty gritty details are in the PLDI 2014 paper [0]. Some details have slightly changed, though (e.g. DoppioJVM supports JDK8 now).
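The idea of mapping thread quanta to JavaScript events can be sketched roughly as follows. This is an illustration of the general technique only, not Doppio's actual scheduler: `SimThread`, `Defer`, `runThreads`, and `makeCounter` are hypothetical names, the quantum is measured in steps rather than elapsed time, and each `step()` stands in for "run until the next function call".

```typescript
// One simulated thread: step() runs up to the next suspension point and
// returns true once the thread has finished. (Hypothetical interface.)
interface SimThread {
  step(): boolean;
}

// Hypothetical hook for "post a JavaScript event"; in a browser this could
// wrap something like setTimeout or postMessage.
type Defer = (callback: () => void) => void;

function runThreads(threads: SimThread[], stepsPerQuantum: number, defer: Defer): void {
  const runQueue = threads.slice();
  const quantum = (): void => {
    const current = runQueue.shift();
    if (current === undefined) {
      return;
    }
    let finished = false;
    for (let i = 0; i < stepsPerQuantum && !finished; i++) {
      finished = current.step();
    }
    if (!finished) {
      runQueue.push(current); // quantum expired: preempt and requeue
    }
    if (runQueue.length > 0) {
      defer(quantum); // hand control back to the event loop between quanta
    }
  };
  defer(quantum);
}

// Demo helper: a thread that appends its label to `log` `total` times.
function makeCounter(label: string, total: number, log: string[]): SimThread {
  let completed = 0;
  return {
    step(): boolean {
      log.push(label);
      completed++;
      return completed >= total;
    },
  };
}
```

Because every quantum ends by returning to the event loop, the page stays responsive and other "threads" get a turn, even though only one piece of JavaScript runs at a time.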
Actually, those are not sufficient to implement weak references, and have a much different use case! With WeakMap and WeakSet, the keys are weakly referenced, hence the names. If you have a WeakMap, you can't produce a value stored in the map without a strong reference to a key.
You can actually polyfill WeakMap and WeakSet. You can't do the same for weak references.
I misspoke -- you can polyfill WeakSet, but you cannot polyfill WeakMap, since you do not know when you can shrink the map. (But you still cannot use it to emulate weak references, since you need a strong reference to get data out of the map.)
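A short sketch of the point about needing a strong reference to the key. It uses only the standard `WeakMap` API; the `labels`, `tag`, and `labelOf` names are made up for illustration.

```typescript
// Per-object metadata, keyed weakly by the object itself.
const labels = new WeakMap<object, string>();

function tag(target: object, label: string): void {
  labels.set(target, label);
}

// A lookup needs the key object in hand, i.e. a strong reference to it.
function labelOf(target: object): string | undefined {
  return labels.get(target);
}

const session = { id: 1 };
tag(session, "first session");

const found = labelOf(session);     // "first session"
const missing = labelOf({ id: 1 }); // undefined: an equal-looking object is a different key
```

There is also no way to enumerate a WeakMap's entries, so once every reference to `session` is gone, nothing can reach the stored label. That is the opposite of what a weak reference offers, where you hold the reference and ask whether the target is still alive.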
You can use simulated threads with JavaPoly, which allows you to utilize the full Java threading model (including locks). If you enable the native jvm, you get true threads.
The reason you wouldn't want to use WebWorkers as a primitive for building threads is that the javascript memory model doesn't allow for shared memory, so you would take a huge performance hit while crossing the serialization boundary for virtually every memory access. You'd be better off just using the simulated threads at that point.
It's true that SharedArrayBuffer is coming, but there's no way to share objects from what I understand. Since Doppio maps JVM objects to JS objects, it cannot take advantage of shared array buffers to emulate shared memory threads.
Well, if we represented objects with a blob of bytes, we would have to implement our own garbage collector and manage our own heap, as object references would just be a pointer that points into an array somewhere.
For interop with JavaScript, there's a usability difference between a JavaScript object and a blob of bytes, although that could be overcome with an object "mirror" that proxies operations appropriately.
Our approach was to leverage the existing GC and language features that browsers already have.
Actually, Doppio is an artifact from my own research, so there is a very good reason "why" [0]! Basically, if you want to re-use your existing, well-tested code in the browser, it is quite difficult. The browser environment is very different from the environment that most programs expect. Doppio bridges the gap between the environment that these programs expect and the environment that the browser presents, making it possible to bring a full JVM into the browser that can run complicated, unmodified programs.
From a technical because-we-can standpoint this is really cool, but I am having a hard time imagining actual use cases that warrant running a JVM in the browser. For modern web applications it is too slow (judging from your paper), so perhaps abandoned legacy applications?
Yes; the paper mentions CodeMoo.com, which the University of Illinois created independently of us to teach basic programming skills to kids [0]. The file system component is readily used by the Internet Archive for their MS-DOS collection [1]. A number of other instructors approached me regarding using Doppio for in-browser IDEs, but Doppio is somewhat cumbersome to integrate into webpages, and I never had the time to dramatically improve its documentation and ease-of-use. I believe that is a more fundamental barrier to using Doppio than its performance, especially since you can run Doppio in a WebWorker to avoid some of the responsiveness issues.
Also, note that Doppio is an interpreter, so there is significant interpreter overhead. Using a JIT compilation approach would improve performance, and a contributor is currently working on a basic implementation. I honestly believe that it could become significantly faster with additional work, but as a single person with other projects, I lack the resources to do this work myself.
I was expecting a routine in JavaScript for a fast implementation of 2d polygonal texture filling. I was disappointed as this was something about loading Java in a browser.
It would require a complete rewrite; I commented previously [0] describing how WebAssembly seems inappropriate for Doppio, and also discussed similar thoughts about SharedArrayBuffer [1]. Also, the main source of slowness is due to the fact that Doppio uses a JVM interpreter and does not JIT; it could be much faster than it currently is with additional engineering.