Monday, November 1, 2010

Animating the Canvas part 9: Getting Dirty Quicker

When I was first starting to learn to program, I was told three steps to follow in order to create good programs. The first step is to get your program running. The second step is to get it running correctly. The final step, which is often not needed, is to get it running fast. In the old days (the 1990's) programmers were more concerned with speed (100MHz was fast!) so a lot of programmers would optimize their code before the program was even running. Often, the optimizations used would make their code messier making it harder to actually get the program running correctly or at all.

The big trick to optimizing is simply the knowledge that in the normal course of running only a small percentage of the code being ran is responsible for the bulk of the time. If you can find the parts of your program that most of the executing time is taking place in, you can optimize those areas and gain a fair bit of speed. Tools like FireBug make this very easy as they have tools that do the tracking for you.

Once you know what code needs to be fixed, the key question becomes how to speed it up. My advice would be to learn Assembly language (aka Machine code). While you may never need to use it, which is unfortunate as it is great fun to code in, knowing what the machine is actually doing at a low level will give you a much greater appreciation for what is going on with your program. Another technique, which I don't do as much as I should, is to simply use the debugger to single step through the function in question. This can give you a huge insight into what is going on.

BGLayers have two parts that need focus on. First, is the adding of dirty rectangles. Next is the rendering of things. The addDirty function is relatively minor CPU wise, but during my discovery at how poor Canvas Clipping is I noticed a lot of Rectangles were being created needlessly and wanted to reduce this.  Creating a new instance of an object has some overhead to it so avoiding doing so will speed things up. Right at the start of addDirty, a new instance of a rectangle is created. This was done to support null parameters which add the entire layer, but when you consider that the function will call the same function until the parent is reached, this can be very inefficient.

The problem is that a new instance of the rectangle passed has to be created as it has to be altered to assure it is pixel aligned. While this alteration is minor but could muck up the rectangle if it is used by unrelated parts of the program. When we start clipping the dirty rectangle we use recursion to call the function causing many unnecessary rectangles to be created. The solution is to eliminate the recursion. This is easily done by simply having an array with a list of rectangles that need to be clipped. Initially there is only the initial rectangle in this list. If it is clipped, new rectangles get added to the list instead of having to call addDirty. This eliminates the recursion.

With these simple but major changes, the program is run and no noticeable difference in run time. Well, at least I know there will be less garbage collection. My point about not optimizing where there is no need has proven itself as I wasted almost an hour for no gain. Still, I am happier with the code as recursion has been eliminated making the code nicer.

Next, we try to speed up the part that is taking up over two-thirds of the cycles in my tests. Namely the draw self function. This again leads me back to ancient times (the 90's) when the big bugaboo in graphics programming was  drawing pixels multiple times per frame. In those times, accelerated graphics cards were hit or miss and accessing video memory was costly. Things have changed today but eliminating unnecessary drawing is still a good idea. In fact, this was planned from the beginning and is why there is a solid variable to let the renderer know if a layer is solid. Anything underneath a solid layer is not going to be seen so if the portion being drawn is contained within the solid layer, only that layer and any layers above it need to be drawn. This is done in the render function, which the drawSelf function is called from. The rendering code has been changed to only call drawSelf if no solid layers contain the bounds being rendered.

    for (cntr = this.children.length - 1; cntr >= 0; --cntr) {
        if ((this.children[cntr].solid) && (this.children[cntr].findRealPosition().containsRect(bounds))) {
            startDraw = cntr;
    if (startDraw == -1) {
        this.drawSelf(ctx, bounds);
        startDraw = 0;
    for (cntr = startDraw; cntr < this.children.length; ++cntr)
        this.children[cntr].render(ctx, bounds);

That simple change has a drastic effect on performance. DrawSelf now accounts for less than half of the drawing time. The next step will be to make the API friendlier. The code for this post is located at

No comments: