Archive for May, 2013
Duetto (C++ for the Web): comparison to asm.js and other clarifications
Posted by Alessandro Pignotti in Leaningtech on May 28, 2013
Around a month ago we posted a first overview of Duetto: our integrated LLVM-based solution for programming both the client and the server side of Web applications using C++. We have been completely overwhelmed by the interest generated by that post!
We tried to keep track of the many insightful comments and constructive criticism we received on this blog and on aggregators. Here are the most common questions that arose:
1) Why didn’t you compare Duetto with asm.js?
We believe that comparing to asm.js-enabled code is not an apples-to-apples comparison. In the current state the Firefox asm.js Ahead-Of-Time compiler can only be enabled on code which is generated using the emscripten approach to memory allocation, so code generated by Duetto has no way to benefit from it. We are open to discuss how the asm.js approach of validating typed-ness of code AOT can be extended so that Duetto, and potentially other solutions, may benefit from it as well.
Still, it is clearly very interesting to compare Duetto performance to asm.js, so we have made a new round of benchmarking that includes it. The results are pretty interesting. As you can see, although asm.js is always faster than emscripten, there are cases in which Duetto on V8 outperforms asm.js (on Spidermonkey) as well. We think that this is caused by the highly efficient handling of objects that V8 provides and shows that there is no clear winner. It is very hard to predict what approach will perform better, and by how much, on real world scenarios.
2) Emscripten is efficient because it effectively disables the Garbage Collector. How does Duetto handle dynamic memory?
Duetto maps C++ objects to native JS objects. This of course means that, every now and then, the GC will run. Emscripten handles memory by pre-allocating all the needed memory and then assigning slices of it using a malloc implementation. So yes, since the memory is not really dynamic it can avoid all the performance impact of the garbage collector
But this advantage does not come for free when you include memory consumption into account. Memory is a resource shared will all the other applications on the system, both native ones and other web apps. Eagerly reserving memory can have a large effect on the performance of the system as a whole.
While the GC may cause a sizeable overhead in terms of CPU cycles, it also decreases memory usage by freeing unneeded memory. Generally speaking, even when programming on traditional platforms, such as x86, dynamic memory allocations and deallocations should be avoided in performance sensitive code. Pre-allocating the required memory directly in the application code is indeed possible with Duetto as well.
Since I do not expect native platforms (i.e. GLibc) to preallocate gigs of memory at application startup to make it faster when dynamic memory is actually used, I do not expect this from a compiler for the JavaScript target either.
3) Is Duetto code interoperable with existing JS code, libraries and vice-versa?
Yes, absolutely. Methods compiled by Duetto can be exported using the regular C++ mangling or even without the mangling if they are declared as “extern C”. In both cases the can be freely invoked from pure JavaScript code. At the same time Duetto compiled code will be able to seamlessly access all the HTML5/DOM APIs and any JS function. The only requirement is that the JS interface needs to be declared in a C++ header.
We currently automatically generate the header for all the HTML5/DOM APIs using headers originally written for TypeScript. We also plan to write headers for the most popular JS libraries, but we would like to stress that the needed headers are just plain C++ class declarations and no special additional support in the compiler is required to support new APIs or new libraries. So when the newest and greatest Web API comes out anyone will be able to write the header to use it with Duetto.
4) Why don’t you release immediately?
Because we want to make the developer experience sufficiently polished before giving it to users. Anyone how had the misfortune of using buggy compiler will probably appreciate this. We plan to release it in 6 months from the previous post (so around 5 months from now, time flies) under a dual licensing scheme: as Open Source for open source and non commercial projects, and with a paid license for commercial development.
5) Will Duetto make reverse engineering of original code easy?
In Duetto, C++ code passes through the regular Clang/LLVM pipeline before being converted to JavaScript. This is the same path used to compile assembly for any other architecture, like x86 or ARM. The generated JavaScript is no easier to understand than the corresponding machine code or assembly. Moreover, one of the typical optimization steps for JavaScript, which we can employ as well, is ‘minimization’ which destroys any residual information about variables and method names.
6) Will other languages will be supported?
We choose to start from C++ because it’s a language that has proven to be good enough for very large scale projects and also because it’s the one we know best and love. Currently we have no plans to expand the Duetto architecture to other languages, but it is definitely possible that in the future, based on user demand, we could bring our awesome integrated client/server development model to other languages as well.
Duetto (C++ for the Web): comparison to asm.js and other clarifications
Posted by Alessandro Pignotti in Leaningtech on May 28, 2013
Around a month ago we posted a first overview of Duetto: our integrated LLVM-based solution for programming both the client and the server side of Web applications using C++. We have been completely overwhelmed by the interest generated by that post!
We tried to keep track of the many insightful comments and constructive criticism we received on this blog and on aggregators. Here are the most common questions that arose:
1) Why didn’t you compare Duetto with asm.js?
We believe that comparing to asm.js-enabled code is not an apples-to-apples comparison. In the current state the Firefox asm.js Ahead-Of-Time compiler can only be enabled on code which is generated using the emscripten approach to memory allocation, so code generated by Duetto has no way to benefit from it. We are open to discuss how the asm.js approach of validating typed-ness of code AOT can be extended so that Duetto, and potentially other solutions, may benefit from it as well.
Still, it is clearly very interesting to compare Duetto performance to asm.js, so we have made a new round of benchmarking that includes it. The results are pretty interesting. As you can see, although asm.js is always faster than emscripten, there are cases in which Duetto on V8 outperforms asm.js (on Spidermonkey) as well. We think that this is caused by the highly efficient handling of objects that V8 provides and shows that there is no clear winner. It is very hard to predict what approach will perform better, and by how much, on real world scenarios.
2) Emscripten is efficient because it effectively disables the Garbage Collector. How does Duetto handle dynamic memory?
Duetto maps C++ objects to native JS objects. This of course means that, every now and then, the GC will run. Emscripten handles memory by pre-allocating all the needed memory and then assigning slices of it using a malloc implementation. So yes, since the memory is not really dynamic it can avoid all the performance impact of the garbage collector
But this advantage does not come for free when you include memory consumption into account. Memory is a resource shared will all the other applications on the system, both native ones and other web apps. Eagerly reserving memory can have a large effect on the performance of the system as a whole.
While the GC may cause a sizeable overhead in terms of CPU cycles, it also decreases memory usage by freeing unneeded memory. Generally speaking, even when programming on traditional platforms, such as x86, dynamic memory allocations and deallocations should be avoided in performance sensitive code. Pre-allocating the required memory directly in the application code is indeed possible with Duetto as well.
Since I do not expect native platforms (i.e. GLibc) to preallocate gigs of memory at application startup to make it faster when dynamic memory is actually used, I do not expect this from a compiler for the JavaScript target either.
3) Is Duetto code interoperable with existing JS code, libraries and vice-versa?
Yes, absolutely. Methods compiled by Duetto can be exported using the regular C++ mangling or even without the mangling if they are declared as “extern C”. In both cases the can be freely invoked from pure JavaScript code. At the same time Duetto compiled code will be able to seamlessly access all the HTML5/DOM APIs and any JS function. The only requirement is that the JS interface needs to be declared in a C++ header.
We currently automatically generate the header for all the HTML5/DOM APIs using headers originally written for TypeScript. We also plan to write headers for the most popular JS libraries, but we would like to stress that the needed headers are just plain C++ class declarations and no special additional support in the compiler is required to support new APIs or new libraries. So when the newest and greatest Web API comes out anyone will be able to write the header to use it with Duetto.
4) Why don’t you release immediately?
Because we want to make the developer experience sufficiently polished before giving it to users. Anyone how had the misfortune of using buggy compiler will probably appreciate this. We plan to release it in 6 months from the previous post (so around 5 months from now, time flies) under a dual licensing scheme: as Open Source for open source and non commercial projects, and with a paid license for commercial development.
5) Will Duetto make reverse engineering of original code easy?
In Duetto, C++ code passes through the regular Clang/LLVM pipeline before being converted to JavaScript. This is the same path used to compile assembly for any other architecture, like x86 or ARM. The generated JavaScript is no easier to understand than the corresponding machine code or assembly. Moreover, one of the typical optimization steps for JavaScript, which we can employ as well, is ‘minimization’ which destroys any residual information about variables and method names.
6) Will other languages will be supported?
We choose to start from C++ because it’s a language that has proven to be good enough for very large scale projects and also because it’s the one we know best and love. Currently we have no plans to expand the Duetto architecture to other languages, but it is definitely possible that in the future, based on user demand, we could bring our awesome integrated client/server development model to other languages as well.