Archive for June, 2009
The quest for graphics performance: part I
Posted by Alessandro Pignotti in Coding tricks on June 27, 2009
While developing and optimizing Lightspark, the modern Flash player, I’m greatly expanding my knowledge and understanding of GPU internals. In the last few days I’ve found a couple of nice tricks that boosted performance and, as nice side effects, saved CPU time and added features.
First of all, I’d like to introduce a bit of the Lightspark graphics architecture.
The project is designed from the ground up to make use of the features offered by modern graphics hardware, namely 3D acceleration and programmable shaders. The Flash file format encodes the geometries to be drawn as sets of edges. This representation is quite different from the one understood by GPUs, so the geometries are first triangulated (reduced to a set of triangles). This operation is done on the CPU and is quite computationally intensive, but the results are cached, so overall it does not hurt performance.
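To make the "triangulate once, then reuse" idea concrete, here is a minimal sketch in Python. It is not Lightspark's code: it assumes the outline is already a convex polygon and uses a simple fan triangulation, while real Flash shapes need a more general algorithm. The names `triangulate_convex` and `square` are hypothetical.

```python
from functools import lru_cache
from typing import Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


@lru_cache(maxsize=None)
def triangulate_convex(outline: Tuple[Point, ...]) -> Tuple[Triangle, ...]:
    """Fan-triangulate a convex polygon given as an ordered tuple of vertices.

    Assumption: the polygon is convex. The result is cached, so asking for
    the same outline again costs only a dictionary lookup.
    """
    if len(outline) < 3:
        return ()
    first = outline[0]
    return tuple(
        (first, outline[i], outline[i + 1])
        for i in range(1, len(outline) - 1)
    )


square = ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0))
triangles = triangulate_convex(square)        # computed on the CPU
triangles_again = triangulate_convex(square)  # served from the cache
```

The second call returns the very same tuple object without redoing the work, which is why the expensive step does not have to be paid on every frame.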
Moreover, Flash offers several different fill styles that can be applied to a geometry, for example solid color and various kinds of gradients. Lightspark handles all those possibilities using a single fragment shader, a little piece of code that is invoked on every pixel to compute the desired color. Of course the shader has to know about the current fill style. This information, along with several other parameters, can be passed in different ways. More on this in the next issue.
There is one peculiar thing about the shader though. Let’s look at some simple pseudocode:
gl_FragColor=solid_color()*selector[0]+linear_gradient()*selector[1]+circular_gradient()*selector[2]...;
Selector is a binary array: the only allowed values are zero and one, and exactly one element is one. This means that the current fragment (pixel) color is computed for every possible fill style, and only afterwards is the correct result selected. This may look like a waste of computing power, but it is actually more efficient than something like this:
if(selector[0]) gl_FragColor=solid_color(); else if(selector[1]) gl_FragColor=linear_gradient(); ...
This counterintuitive fact comes from the nature of graphics hardware. GPUs are very different from CPUs and are capable of crunching tons of vector operations blindingly fast, but they fall to their knees when they encounter branches in the code. This is actually quite common on deeply pipelined architectures that lack complex branch prediction circuitry: not only GPUs, but also number-crunching devices and multimedia monsters such as the IBM Cell. Keep this in mind when working on such platforms.
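The two formulations can be compared on the CPU with a small Python sketch. It only illustrates that the branchless blend and the branching version produce the same color when the selector is one-hot; it says nothing about speed, since the performance argument applies to GPU hardware. The fill functions and their formulas are placeholders invented for the example, not Lightspark's shader code.

```python
import math

# Placeholder fill styles: each returns an (r, g, b) color for a fragment
# at position (x, y) in the unit square. The formulas are made up.
def solid_color(x, y):
    return (1.0, 0.0, 0.0)

def linear_gradient(x, y):
    return (x, x, x)

def circular_gradient(x, y):
    r = min(1.0, math.hypot(x - 0.5, y - 0.5) * 2.0)
    return (r, r, r)

FILLS = (solid_color, linear_gradient, circular_gradient)


def shade_branchless(x, y, selector):
    """Evaluate every fill style, then blend with the one-hot selector."""
    colors = [fill(x, y) for fill in FILLS]
    return tuple(
        sum(color[channel] * weight for color, weight in zip(colors, selector))
        for channel in range(3)
    )


def shade_branching(x, y, selector):
    """Evaluate only the selected fill style, using explicit branches."""
    for fill, weight in zip(FILLS, selector):
        if weight:
            return fill(x, y)
    return (0.0, 0.0, 0.0)


# Selector for "linear gradient": exactly one element is one.
color = shade_branchless(0.25, 0.75, (0, 1, 0))
```

With a one-hot selector every unwanted term is multiplied by zero, so only the selected fill style contributes to the sum.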
Lightspark second technical demo announcement
Posted by Alessandro Pignotti in Insane Projects on June 13, 2009
I’m currently finishing some last cleanups and enhancements before releasing a second technical demo of the Lightspark Project. Much time has passed since the first demo, and the project is growing healthily. This release aims at rendering the following movie, selected from Adobe’s demos. The results may not look very impressive, but many things are going on under the hood.
The most interesting features in this release are:
- GLSL-based rendering of fill styles (e.g. gradients)
- LLVM-based ActionScript execution. Code is compiled just in time.
- A few tricks are also played to decrease the stack traffic typical of stack machines.
- A first, albeit simple, implementation of framerate timing
- A framework to handle ActionScript asynchronous events. Currently only the enterFrame event works, as the input subsystem is not yet in place. But stay tuned, as I have some nice plans for that.
The code will be released in a couple more days, or at least I hope so.
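The post does not say which tricks are used to cut stack traffic, so the following Python sketch shows only one common approach, as an assumption: while translating stack bytecode, keep a compile-time "virtual stack" of value names, so pushes and pops disappear and each operation refers to its operands directly. The opcode names and the `lower_to_ssa` helper are hypothetical and do not match real ActionScript bytecode or the LLVM API.

```python
def lower_to_ssa(bytecode):
    """Translate toy stack bytecode into register-style instructions.

    The virtual stack exists only at translation time and holds operand
    names, so the generated code performs no runtime pushes or pops.
    """
    virtual_stack = []
    instructions = []
    counter = 0
    for op, *args in bytecode:
        if op == "push":            # constant operand
            virtual_stack.append(str(args[0]))
        elif op == "load":          # named local variable
            virtual_stack.append(args[0])
        elif op in ("add", "mul"):  # binary operation on the top two values
            rhs = virtual_stack.pop()
            lhs = virtual_stack.pop()
            name = f"t{counter}"
            counter += 1
            instructions.append(f"{name} = {op} {lhs}, {rhs}")
            virtual_stack.append(name)
        elif op == "store":         # write the top value to a local
            instructions.append(f"{args[0]} = {virtual_stack.pop()}")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return instructions


# result = a * 2 + b, written as stack code
program = [
    ("load", "a"), ("push", 2), ("mul",),
    ("load", "b"), ("add",),
    ("store", "result"),
]
lowered = lower_to_ssa(program)
# lowered == ["t0 = mul a, 2", "t1 = add t0, b", "result = t1"]
```

Six stack instructions become three register-style ones, and none of them touches a runtime stack.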
ActionScript meets LLVM: part II
Posted by Alessandro Pignotti in Insane Projects on June 10, 2009
Just a quick update. The nice tricks I’m playing to build a fast ActionScript VM using LLVM are now the topic of my bachelor thesis, which will still take a handful of months to complete. If you are interested in the development, you may follow the git changelog here or contact me privately.