Duetto: A faster and smarter alternative to Emscripten. And more.

We have seen a lot of hype on the Web recently after the announcement of asm.js by Mozilla: a new “extraordinarily optimizable, low-level subset of JavaScript”. The system builds on the work that has been done on Emscripten: an LLVM-based solution which compiles C++ to JavaScript, allowing for an easy port of applications and games to the Web. The excitement for asm.js stems from the fact that, by using a special virtual machine integrated in Firefox, it can improve the performance of Emscripten-generated code and get it even closer to native performance.

We (Lean­ing Tech­nolo­gies Ltd) would like to intro­duce Duetto, our own LLVM-based solu­tion for pro­gram­ming the Web using C++. And by the Web, we mean both the client and server side of it, but let’s talk about the client side first.

Emscripten handles C++ code by emulating a full byte-addressable address space. This is definitely a good solution, but a suboptimal one. JavaScript is not based on a byte-addressable memory model, but on an object-addressable one: all the accessible memory is contained in some object. But when you think about it, C++ is not that different.

Our solution integrates with clang and the LLVM toolchain and is able to map C++ object-oriented constructs to native JavaScript objects. It turns out that accessing objects on modern JavaScript engines is faster than accessing arrays. Using this trick (and a few more), we obtained the following preliminary results on micro-benchmarks.

For each benchmark, the best time out of 10 runs has been selected. The V8 and SpiderMonkey JavaScript shells have been used. The respective commits are b13921fa78ce7d7a94ce74f6198db79e075a2e03 and b9d56a1e0a61. *The fasta benchmark has been modified by removing a memory allocation inside the main loop.

We achieved this by realizing that, once some unsafe C++ capabilities are disallowed (such as type-unsafe pointer casting and pointer arithmetic inside structures), it is actually possible to generate smaller and faster JavaScript code from C++. Interestingly enough, we discovered that, in most cases, the constructs we need to forbid are already specified as undefined behaviour!
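To make the restriction more concrete, here is a small C++ sketch. The names `Particle`, `weighted_sum` and `sum` are made up for illustration and are not part of Duetto, and the exact set of constructs Duetto rejects is not spelled out in this post. The functions only touch members by name and array elements by index; the commented-out lines show the kind of type-unsafe cast and intra-structure pointer arithmetic that standard C++ already leaves undefined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type; not taken from Duetto.
struct Particle {
    float x;
    float y;
    float mass;
};

// Well defined: members are accessed by name, so each access can be
// expressed as a property read on a single object.
float weighted_sum(const Particle& p) {
    return (p.x + p.y) * p.mass;
}

// Well defined: pointer arithmetic that stays inside one array.
float sum(const float* values, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Undefined behaviour in standard C++, shown only as comments:
//   Particle p{1.0f, 2.0f, 3.0f};
//   int bits = *reinterpret_cast<int*>(&p.mass); // reads a float through an int lvalue
//   float y  = *(&p.x + 1);                      // walks from one member into the next
```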

So, yes, Duetto does need some min­i­mal port­ing to bring C++ code to the Web while emscripten makes it mostly free. What you get in exchange for that is faster per­for­mance with no need of a spe­cial VM and deep inte­gra­tion with the browser. Duetto cre­ates a really seam­less C++ pro­gram­ming expe­ri­ence for the Web:

  • Seam­less inte­gra­tion with the browser envi­ron­ment, com­plete access to the DOM and HTML5 tech­nolo­gies includ­ing WebGL. You can even access and use your favourite JavaScript library or exist­ing JavaScript from C++ by declar­ing the avail­able inter­faces in the C++ code using a sim­ple convention.
  • Seamless client/server programming, using transparent RPCs in a single codebase. The compiler will automatically split the code into the client part (compiled to JavaScript) and the server part (compiled to native code).

The Duetto back­end is already in a very advanced state, and we believe it’s already suit­able to bring the first appli­ca­tions to the Web. Espe­cially games, which are our pri­mary tar­get. Unfor­tu­nately our front end is not yet as pol­ished as we would like, as we want to improve the error report­ing to make the port­ing expe­ri­ence as smooth as possible.

We are not yet ready to release Duetto, but we are eager to start open­ing col­lab­o­ra­tions, so if you are inter­ested in bring­ing your C++ appli­ca­tion or game to the web, feel free to con­tact me (alessandro@leaningtech.com). We believe that in six months or less from now we will be able to release a robust prod­uct, most prob­a­bly capa­ble of gen­er­at­ing even faster code. And we want to release it as open source.

For more infor­ma­tion please visit our site: http://www.leaningtech.com

  • mar­tin

    What about releasing early? In order to establish an open source presence you should consider just opening up what you've got. Good luck with the project anyhow, it is very interesting!

  • Scionic­Spec­tre

    This is very inter­est­ing– some addi­tional bench­marks against native C++ may be of inter­est. Of course, the promise of direct inte­gra­tion with mod­ern browser tech­nol­ogy like WebGL makes this a bit more inter­est­ing than just ‘port­ing my app’. It makes C++ seem like an option for web devel­op­ment itself, not just legacy code we need to drag along. I’m def­i­nitely keep­ing an eye on this.

  • l

    Shouldn’t you also com­pare with asm.js? Since that will be the default per­for­mance pro­file for emscripten for firefox.

  • http://www.joshmatthews.net Josh Matthews

    One of the pri­mary ben­e­fits of emscripten’s model is that it elim­i­nates garbage col­lec­tion pauses by using the fixed-size heap and per­form­ing no other allo­ca­tions. How does Duetto address this problem?

  • http://www.syntensity.com/ krip­ken

    1. Looks like the numbers here are without asm.js? With asm.js they should look more like http://kripken.github.io/mloc_emscripten_talk/#/27 In particular Firefox with asm.js should be faster than both Firefox and Chrome. (When using emscripten, make sure you compile with -O2 -s ASM_JS=1, or just run the benchmark suite, which does that for you.)

    2. Emscripten and the llvm-js-backend project tried a few approaches similar to what Duetto is described as doing here, a while back. I’d be happy to share experiences from that if it helps this project. For example, how do you handle pointers to a member of a structure? (We had various approaches to that and similar problems.)

    3. In gen­eral, I think your approach will lead to much nicer-looking code and to eas­ier interop with nor­mal JS. How­ever, your per­for­mance will likely not be as fast as the emscripten/mandreel approach, because of (a) GC pauses, (b) lack of full LLVM opti­miza­tions — which require the assump­tion of alias­ing typed arrays to rep­re­sent mem­ory, e.g. write an int, read a byte from inside it, and (c) read­ing a typed array value will, in a fully opti­mized sit­u­a­tion, always be at least as fast as a read from a JS object prop­erty, and in asm.js mode, gen­er­ally sig­nif­i­cantly faster.

    4. asm.js does not require a “spe­cial VM”. It parses, opti­mizes and runs in the nor­mal Fire­fox JS engine, in the Fire­fox imple­men­ta­tion of it. The only dif­fer­ence is that we type check the parse tree and use that to emit type infor­ma­tion into the JS engine opti­mizer, and that we use some sand­box­ing tricks to avoid bounds checks etc. But even that is not nec­es­sary, the v8 team for exam­ple is dis­cussing opti­miz­ing asm.js code using a very dif­fer­ent approach.
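The "write an int, read a byte from inside it" pattern from point 3(b) can be sketched in C++ as follows. This is only an illustration: `byte_at` is a made-up helper and does not come from Emscripten or Duetto. In a byte-addressable heap this is a plain load at an offset; how an object-based model would handle it is not described in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative helper (not from Emscripten or Duetto): store a 32-bit
// integer in memory, then read back one of its bytes.
// Inspecting an object through unsigned char* is allowed by the C++
// aliasing rules; which byte comes first depends on endianness.
unsigned char byte_at(std::uint32_t value, std::size_t index) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    return bytes[index];
}
```

Because the byte order is platform-dependent, code using this pattern cannot assume which byte it gets for a given index.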

  • apig­notti

    1) We believe that com­par­ing to asm.js-enabled code is not an apples-to-apples com­par­i­son. In the cur­rent state the asm.js AOT com­piler can only be enabled on code which is gen­er­ated using the emscripten approach to mem­ory allo­ca­tion, so code gen­er­ated by Duetto has no way to ben­e­fit from it. We acknowl­edge though that there is high value in com­par­ing Duetto per­for­mance to asm.js and we will release an updated graph includ­ing it as soon as pos­si­ble. We are also open to dis­cuss how the asm.js approach of val­i­dat­ing typed-ness of code AOT can be extended so that Duetto, and poten­tially other solu­tions, may ben­e­fit from it as well.

    2) Our basic approach is to model all pointers as consisting of two parts: a reference to a container object and an offset to the pointed object. This is slow, but we make use of it very rarely. Most of the time the compiler can resolve the access to a simple property access on the container object.

    3) The main target of Duetto is to provide an integrated C++ coding experience for web applications. One of its main design goals is to guarantee transparent interoperability and access to the browser APIs and the DOM. The same transparency applies to the memory allocation strategy in code compiled by Duetto. Dynamic memory management is also slow in native code and should be avoided when performance is the critical factor. In such cases the developer may choose to preallocate the required memory. We agree that preallocation is the best option when reducing overhead is the priority: in these cases, it should be done at the application level and not at the compiler/runtime level. Moreover, since the objects used by code compiled with Duetto are managed by the regular GC, the memory footprint will closely match the memory actually used, which should be good for long-lived, complex applications.

    4) Sorry, that was indeed mis­worded, what we meant was that asm.js requires spe­cial sup­port from the VM to be opti­mized, while code gen­er­ated by Duetto doesn’t.
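The two-part pointer described in point 2 can be modelled roughly as below. This is a sketch under assumptions: Duetto emits JavaScript, so the C++ here only mirrors the idea, and `FatPtr`, `sum_via_fat_ptr` and `sum_direct` are hypothetical names, not Duetto internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a "container + offset" pointer.
template <typename T>
struct FatPtr {
    std::vector<T>* container; // reference to the object holding the data
    std::size_t offset;        // position of the pointee inside it

    T& operator*() const { return (*container)[offset]; }

    // Pointer arithmetic only changes the offset.
    FatPtr operator+(std::size_t n) const { return FatPtr{container, offset + n}; }

    bool operator==(const FatPtr& other) const {
        return container == other.container && offset == other.offset;
    }
};

// General case: every access goes through the container/offset pair.
int sum_via_fat_ptr(FatPtr<int> first, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += *(first + i);
    }
    return total;
}

// Resolved case: when the target is known statically, the access
// collapses to a direct element (or property) access.
int sum_direct(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

Both functions compute the same result; the difference is that the first carries the container reference and offset around for every access, which is the cost the comment calls "slow".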

  • apig­notti

    In the Emscripten model, garbage collection is worked around by preallocating all the needed memory. We believe that developers looking for the best performance should avoid dynamic memory and preallocate the memory at the application level when needed, similarly to how the issue is handled in native C++ programs. Having the regular GC manage most of the objects is actually good, since the memory footprint will be closer to the amount of memory actually needed, reducing the burden on the system when complex, long-lived applications are used.
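Application-level preallocation of the kind mentioned here is commonly done with an object pool. The following is a minimal sketch, assuming a fixed upper bound on live objects; `Bullet` and `BulletPool` are hypothetical names chosen for the example and are unrelated to Duetto's runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical game object used only for this example.
struct Bullet {
    float x = 0.0f;
    float y = 0.0f;
    bool active = false;
};

// A fixed-capacity pool: all Bullet objects are created up front and
// reused, so acquire() and release() perform no dynamic allocation.
class BulletPool {
public:
    explicit BulletPool(std::size_t capacity) : bullets_(capacity) {}

    // Returns an inactive bullet, or nullptr when the pool is exhausted.
    Bullet* acquire() {
        for (Bullet& b : bullets_) {
            if (!b.active) {
                b.active = true;
                return &b;
            }
        }
        return nullptr;
    }

    // Marks the bullet as free so a later acquire() can reuse it.
    void release(Bullet* b) { b->active = false; }

    std::size_t active_count() const {
        std::size_t n = 0;
        for (const Bullet& b : bullets_) {
            if (b.active) ++n;
        }
        return n;
    }

private:
    std::vector<Bullet> bullets_; // sized once, never resized
};
```

The linear scan in `acquire()` keeps the sketch short; a free list would avoid it in a pool with many entries.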

  • apig­notti

    We don’t feel that com­par­ing to asm.js is an apples-to-apples com­par­i­son, since the ben­e­fits of Ahead of Time com­pi­la­tion are only avail­able if the sub­set of JavaScript being used is the one gen­er­ated by emscripten. We hope that asm.js restric­tions will be relaxed enough to be use­ful in the code gen­er­ated by Duetto and poten­tially other projects. Still, we acknowl­edge that there is value in com­par­ing with asm.js in the cur­rent state and we plan to release an updated graph includ­ing it as soon as possible.

  • apig­notti

    The data in the graph is actually normalized against native execution: the native time for any test is scaled to be ‘1’. Indeed it’s actually possible to use Duetto not only to port existing C++ applications but also to write maintainable web apps and games. Access to browser APIs, like WebGL, is something that is actually already working and we plan to release a WebGL demo soon.

  • apig­notti

    We want to wait just a bit more to pol­ish up the expe­ri­ence of using Duetto. We’ll try to make the wait as short as possible.

  • http://www.syntensity.com/ krip­ken

    1. We have def­i­nitely dis­cussed how an asm.js-like approach could ben­e­fit com­pil­ers from lan­guages like JSIL, using the upcom­ing Binary Data API. That should help you with Duetto as well, would be great to have your input and feed­back on that. (We hang out in #asm.js in the mozilla IRC, btw.)

    2. A two-part pointer approach was tested in emscripten back in 2010 or so (should all be in git his­tory). It’s true that in small bench­marks you can sim­plify things with opti­miza­tions, how­ever in the gen­eral case it is extremely dif­fi­cult or just impos­si­ble (e.g. through indi­rect calls and with­out strong assump­tions on the project) in our expe­ri­ence. Would be inter­est­ing to see bench­marks of Duetto on larger things (Bul­let, Cube2, etc.) and how you han­dle that. As I say, I am skep­ti­cal, but would be very happy to see inter­est­ing results of course!

    3. Regard­ing mem­ory usage, I would expect Duetto’s to be higher than Emscripten/Mandreel. JS objects have sig­nif­i­cant over­head, while the Emscripten/Mandreel approach gives you the same sizes in mem­ory as in native builds (a struct tak­ing 16 bytes in C takes 16 bytes in JS as well). But it is true that you must have a large typed array for every­thing, and likely part of it will be unused, so Duetto will have the advan­tage there. I guess we can’t be sure which will be bet­ter with­out some real-world tests.

    Aside from mem­ory, though, there is run­time over­head to pass­ing around JS ref­er­ences. You will be going through read and write bar­ri­ers, have GC pauses, etc. Those are the rea­sons that pushed Emscripten away from a 2-part pointer approach (and llvm-js-backend away from their approach which used things like clo­sures) and to the cur­rent model (also shared by Mandreel).

    As mentioned before, I do see a nice advantage of your approach in more readable code that is easier to integrate into normal JS, DOM calls, etc. Just a heads up though, we were planning to improve DOM call integration in Emscripten later this year, it should be close to transparent for the user once we use a WebIDL bindings generator.

  • The Float­ing Brain

    On your statement: “disallowing some unsafe capabilities from C++ (unsafe pointer casting and pointer arithmetic inside of structs)”
    When you say “unsafe pointer casts”, what pointer casts do you mean in particular, or are you making a general statement that all pointer casts are unsafe (will you still be able to use, say, dynamic_cast, or ( int* ) from a char*)?
    Also, when you say “pointer arithmetic” inside of structs, do you mean all of it, or some particular cases, and by struct do you mean “record” in the technical sense or a more modern C++ type struct (not P.O.D) which can hold methods, overloaded operators, etc?
    What other features will be considered “unsafe”? Could you provide some examples?
    And one more (somewhat obscure) question: what about programs that purposefully rely on undefined behavior in this way?

    Will dis­al­low­ing these fea­tures be optional?

    I ask these ques­tions because my main rea­son for using C++ is the lan­guage itself and the free­dom it pro­vides the devel­oper with, and although this looks like every­thing I could have dreamed of for a C++ devel­op­ment plat­form (despite the fact that it is not native) I might not use a com­piler that would take that away.