Blazing Games Development Blog: optimization

Friday, June 28, 2013

Binary Prerequisites

I recently released an educational game where you play with logic gates. The inspiration for the game came from a JavaScript book that didn't even bother covering logical operations. Logic operations are used to manipulate bytes at the per-bit level making them very useful for optimizing the memory-footprint of data structures. Now that everyone has multiple gigabytes of RAM in their computers, optimizing memory is not as high of priority anymore even if it is fairly easy to do. Instead of preaching about this, I am instead going to give an example of its use and the non-boolean-logic alternative.

For a number of games under development, I have a skill system where players select their bonus skills. Many of the skills have prerequisites. Each skill gets assigned a bit within an integer (if there were more than 32 bits, more than one integer could be used, though for me this isn't an issue as skills are grouped into set that contain 16 skills so only 16 bits are needed). Each skill record has an integer for prerequisites. The character class has an integer for each skill set indicating what skills are known. To see if a skill can be learned, a simple logical AND can be done on the prerequisite and the characters skill set. If the result equals the prerequisite, the skill can be learned.

This can be done without binary logic operators by using an array of booleans. Checking prerequisites the becomes a bit more complicated. First, a meets-prerequisites flag is set to true. A for loop looping through the prerequisites array checks if the corresponding skill flag is set. If it is set, the array of skills in the character sheet then gets checked. If the character sheet skill is false, the meets-prerequisites flag is set to false and the for loop is exited.

As many languages use integers to represent Boolean variables, this is a potential waste of 31 bits. In our example, this would be a memory increase of 16 times the amount. Some compilers have optimizations for memory use in which case the compressing of the bits into a single integer will be done by the compiler, but even in that case the compiler will still have to use the looping code for checking prerequisites with the rather significant speed difference as a result. Of course, in addition to having lots of memory, computers also are very fast now so such concerns can be overlooked. But as writing the code to be efficient in the first place is so easy there really is no excuse for such a waste of memory and CPU cycles.

Monday, November 1, 2010

Animating the Canvas part 9: Getting Dirty Quicker

When I was first starting to learn to program, I was told three steps to follow in order to create good programs. The first step is to get your program running. The second step is to get it running correctly. The final step, which is often not needed, is to get it running fast. In the old days (the 1990's) programmers were more concerned with speed (100MHz was fast!) so a lot of programmers would optimize their code before the program was even running. Often, the optimizations used would make their code messier making it harder to actually get the program running correctly or at all.

The big trick to optimizing is simply the knowledge that in the normal course of running only a small percentage of the code being ran is responsible for the bulk of the time. If you can find the parts of your program that most of the executing time is taking place in, you can optimize those areas and gain a fair bit of speed. Tools like FireBug make this very easy as they have tools that do the tracking for you.

Once you know what code needs to be fixed, the key question becomes how to speed it up. My advice would be to learn Assembly language (aka Machine code). While you may never need to use it, which is unfortunate as it is great fun to code in, knowing what the machine is actually doing at a low level will give you a much greater appreciation for what is going on with your program. Another technique, which I don't do as much as I should, is to simply use the debugger to single step through the function in question. This can give you a huge insight into what is going on.

BGLayers have two parts that need focus on. First, is the adding of dirty rectangles. Next is the rendering of things. The addDirty function is relatively minor CPU wise, but during my discovery at how poor Canvas Clipping is I noticed a lot of Rectangles were being created needlessly and wanted to reduce this. Creating a new instance of an object has some overhead to it so avoiding doing so will speed things up. Right at the start of addDirty, a new instance of a rectangle is created. This was done to support null parameters which add the entire layer, but when you consider that the function will call the same function until the parent is reached, this can be very inefficient.

The problem is that a new instance of the rectangle passed has to be created as it has to be altered to assure it is pixel aligned. While this alteration is minor but could muck up the rectangle if it is used by unrelated parts of the program. When we start clipping the dirty rectangle we use recursion to call the function causing many unnecessary rectangles to be created. The solution is to eliminate the recursion. This is easily done by simply having an array with a list of rectangles that need to be clipped. Initially there is only the initial rectangle in this list. If it is clipped, new rectangles get added to the list instead of having to call addDirty. This eliminates the recursion.

With these simple but major changes, the program is run and no noticeable difference in run time. Well, at least I know there will be less garbage collection. My point about not optimizing where there is no need has proven itself as I wasted almost an hour for no gain. Still, I am happier with the code as recursion has been eliminated making the code nicer.

Next, we try to speed up the part that is taking up over two-thirds of the cycles in my tests. Namely the draw self function. This again leads me back to ancient times (the 90's) when the big bugaboo in graphics programming was drawing pixels multiple times per frame. In those times, accelerated graphics cards were hit or miss and accessing video memory was costly. Things have changed today but eliminating unnecessary drawing is still a good idea. In fact, this was planned from the beginning and is why there is a solid variable to let the renderer know if a layer is solid. Anything underneath a solid layer is not going to be seen so if the portion being drawn is contained within the solid layer, only that layer and any layers above it need to be drawn. This is done in the render function, which the drawSelf function is called from. The rendering code has been changed to only call drawSelf if no solid layers contain the bounds being rendered.

    for (cntr = this.children.length - 1; cntr >= 0; --cntr) {
        if ((this.children[cntr].solid) && (this.children[cntr].findRealPosition().containsRect(bounds))) {
            startDraw = cntr;
            break;
        }
    }
    if (startDraw == -1) {
        this.drawSelf(ctx, bounds);
        startDraw = 0;
    }
    for (cntr = startDraw; cntr < this.children.length; ++cntr)
        this.children[cntr].render(ctx, bounds);

That simple change has a drastic effect on performance. DrawSelf now accounts for less than half of the drawing time. The next step will be to make the API friendlier. The code for this post is located at http://www.blazinggames.com/other/openLibrary/html5Libs.

Blazing Games Development Blog

Friday, June 28, 2013

Binary Prerequisites

Monday, November 1, 2010

Animating the Canvas part 9: Getting Dirty Quicker

Be sure and visit

Blog Archive

About Me