Blazing Games Development Blog: Graphics

Showing posts with label Graphics. Show all posts

Saturday, May 10, 2014

NES Hello Sprites 2

NES programs will often clone sprite memory in RAM to allow for DMA transfers. DMA, which stands for Direct Memory Access, is a rapid way of transferring blocks of memory to hardware. When a DMA transfer happens, the processor is temporarily halted while the memory is rapidly copied into PPU memory. While this still takes time, it is much faster than manually copying sprites. If you have a large number of sprites, it is far better to use DMA than to manually copy sprites.

Another convenience that the NES provides programmers is a VBlank Non-Maskable Interrupt (NMI). Interrupts cause the processor to stop what it is doing and immediately start running the subroutine pointed to by the interrupt vector. There are two types of interrupts. Your regular interrupts, or IRQs, can be disabled by the SEI instruction and enabled by the CLI instruction. NMIs are not affected by those instructions so will always occur. The PPU control register, however, does allow the VBlank to be disabled.

The advantage of using interrupts is that you do not have to worry about polling for the VBlank so you can process other game logic while the PPU is drawing the screen without having to worry about missing this important period. In this second sprite demo, all of our logic will be done in the interrupt handler, though this certainly does not have to be the case and in later projects we will be developing a state machine to handle complex processing.

The code for the second version of our sprite demo is very similar to the original version with only a few changes needed to support interrupts and DMA. Instead of copying the sprite data directly to the PPUs sprite memory, it is copied to a page of RAM. VBlank interrupts are enabled just before the radically changed main loop as follows:

LDA #%10000000 ; enable NMI
STA $2000

MainGameLoop:
NOP ; Right now we do nothing in the main loop.
JMP MainGameLoop

Yes, absolutely nothing is done in the main loop. Instead all work is done in the Interrupt. While the NOP instruction is not necessary, I have heard urban legends about loops causing hardware to catch fire. I find this highly suspect but spreading the loop over an extra instruction does give me an excuse to use my favorite 6502 instruction. The real logic happens in the Non-maskable interrupt routine. Though thanks to DMA hardware this is much simpler than the manual method of transferring sprite information.

NMI:
LDA #2
STA $4014
JSR MoveSprites
RTI

As you can see, DMA is very simple to use. You just store the page of memory you want to transfer to the PPU into $4014. The move sprites function is almost identical except it stores sprite positions to RAM instead of writing directly to the PPUs sprite memory which has the added benefit of being more compatible with emulators (always a benefit for home-brew games).

Thursday, May 1, 2014

Maze 3D postmortem

While there was not enough interest in Coffee Quest 2600 to warrant developing it, I really wanted to see if my rendering technique really would work on the 2600 with the limitations the system has. This game is the result and proves that CQ2600 could be done. I would love to develop a proper 2600 RPG though it is doubtful this will happen. Still, here is the postmortem for my 2600 3D maze game.

What Went Right

Stella was the code name for the 2600 console and is also then name of an open source 2600 emulator. If you ever decide to develop a home-brew 2600 game, this is an awesome emulator to use during development. The built in debugger is simply wonderful to use. Not only does it have your break-points and stepping through code capabilities, but you also see where the cathode ray is when stepping through the code. As some of the hardest programming/debugging involves the timing of the cathode ray, this is a wonderful feature. Other handy features include actually seeing what the TIA registers hold and smart disassembly. The disassembler appears to track exactly which parts of the ROM are used for instructions and what parts are used for play-field and sprite data showing graphical representation of the data. This debugger made getting the game working much easier!

Mixed Blessings

The game required reworking the display that was used in my Coffee Quest 2600 game. While a lot of the logic used in that game was usable, there was a fairly large amount of code that needed to be reworked. The decision to have a top bar with a compass in the centre was a direct result of playing Coffee Quest 2600 as I found that it was way too easy to get disoriented while playing the game. The compass at least makes it possible to know which direction you are heading in. I was also going to have coordinates displayed on the bar, but the amount of time and effort to add this was too high so this feature was left out of the game.

What Went Wrong

"Racing the Beam" is a phrase used to describe programming the 2600. This is sort of correct as the entire program revolves around the timing of the display. However, especially when you are doing more complex things such as asymmetrical play-fields, you are not trying to keep ahead of the beam but need to wait for the beam to get ahead of you. With asymmetrical play-fields, you set up the play-field registers for the first half of the display but can only set up the registers for the second half of the display only after the beam has drawn the register you are changing otherwise what is being drawn is incorrect. This results in ugly code where some of the logic that should be done at the end of the block has to be done in the middle of the block. This is so that the beam can draw the play-register that needs to be changed.

This was a very interesting challenge as such tight constraints in memory and timing are rarely issues with modern day programs. With so little time to get things done, seemingly simple things such as shifting mask registers takes too long and has to be replaced with small look-up tables. Yet look-up tables take up precious memory so it is an interesting balancing act.

Thursday, April 24, 2014

NES Hello Sprites 1

For my first demonstration of sprites I decided to go with your typical bouncing sprite. Instead of using a ball, the text message "Hello Sprites!" Is used. Due to the NES limitation of only allowing 8 sprites on a line, the message is broken into two lines of text. This version manually manipulates sprite memory instead of using the more common DMA method that will be covered next. To my surprise, this actually lead to issues with the emulator if ran with the default fast PPU emulation though worked fine when ran with the slower PPU emulator. This is a good reminder that emulators are not always accurate, though developing with them is much faster than developing on real hardware.

The complete source code for this project is at http://blazinggames.com/books/NES/. Only the most relevant sections of code are included.

The program starts out with pretty standard initialization code. The screen is then filled with a checkerboard pattern instead of blank spaces so it will be clear that the sprites are appearing above the screen image (background). Next we have the task of setting up the sprites. In our case we are just copying this info from a table in ROM into the PPU Sprite Memory. Variables for holding the top corner of the block of sprites and the movement directions are then set up. The main loop simply waits for the VBlank flag to be set and then calls the moveSprite function which has all the movement logic.

; initialize the sprites
LDA #0
TAY
STA $2003
; fill sprite memory with data from ROM
InitSprites_dataLoop:
LDA hello_sprite_data,Y
STA $2004
INY
CPY #52
BNE InitSprites_dataLoop

; fill rest of sprite memory with 255 so that it will not be visible
LDA #255
InitSprites_fillLoop:
STA $2004
INY
BNE InitSprites_fillLoop

; set sprite variables
LDA #0
STA SPRITE_X
STA SPRITE_Y
LDA #1
STA H_ADJUST
STA V_ADJUST

The MoveSprites function is fairly large but also simple. The first chunk of code shows how the coordinates are adjusted. Each frame the block's top corner is adjusted by the current direction variables. Hits against the screen bounds are then performed with hits resulting in the direction being changed. What may be confusing, and something I will definitely be writing about in the future, 255 is used for -1.

MoveSprites:
; First adjust top corner of sprite block
LDA SPRITE_X
CLC
ADC H_ADJUST
STA SPRITE_X
; Has hit left edge?
BNE MoveSprite_checkRightEdge
; if so set horizontal adjust to +1
LDA #1
STA H_ADJUST
MoveSprite_checkRightEdge:
; if has hit right edge?
CMP #192
BNE MoveSprite_vertical
; if hit right edge, set horizontal adjust to -1
LDA #$FF ; -1
STA H_ADJUST
; vertical tests similar to above and not included here, see source file for full code.

Once we know the coordinates for the top of the block of sprites, we can calculate the position of the sprites. As the bottom calculations are the more complex ones, here is the code for setting the bottom line of sprites.

; we reach here when top row done, so prepare bottom row
LDA SPRITE_Y
CLC
ADC #8
STA TEMP_Y ; Y coordinate to use for bottom row of sprites
LDA SPRITE_X
STA TEMP_X ; X coordinate starts at block x coordinate
; Y register already correct so no need to set
LDX #8 ; 8 sprites in bottom row need to be counted
MoveSprite_bottomRowAdjust:
STY $2003
LDA TEMP_Y
STA $2004 ; write adjusted sprite Y coordinate
INY ; Adjust index to point to sprite X data
INY
INY
STY $2003 ; tell PPU we want to write sprite X
LDA TEMP_X
STA $2004 ; Write sprite X coordinate
CLC
ADC #8 ; add 8 to this coordinate for next sprite
STA TEMP_X
INY ; set up index for next sprite
DEX ; and adjust countdown
BNE MoveSprite_bottomRowAdjust

As you can see it is fairly simple. Most NES programs, however, do not manipulate sprites directly as we are doing so we will rewrite this program to take advantage of interrupts and DMA,as well as explain what interrupts and DMA shortly but next week will be a postmortem possibly followed by something else.

Friday, April 11, 2014

NES PPU Ports

Over the weekend I finished an Atari 2600 3D maze game which I will be releasing shortly (possibly as part of a large announcement) but if anybody wants early access to the ROM email me.

One of the things that both the NES version of Hello World and the 2600 version had in common is mapping the graphics processor to memory addresses. This a very common way of communicating with devices when using assembly language. The PPU condenses this to only 8 addresses. The audio unit, Sprite DMA, and IO ports are separate maps, but I don't consider them part of the PPU. As has been pointed out earlier, the PPU has it's own memory. The addresses here do not correspond to PPU memory but are simply registers used to control the PPU.

$2000 is the first of two control registers, with bits used to control the PPU settings as follows:
Bits 0 and 1 controls the screen memory (name table in NES terminology) with 00 being PPU address $2000, 01 being $2400, 10 being $2800 and 11 being $2C00.
Bit 2 enables vertical writing, meaning that when set writing to PPU memory increments the next write address by 32 bytes. As the screen has 32 columns this has the effect of writing a vertical strip of screen memory, which is exceedingly handy for side-scrolling games.
Bit 3 selects which of the two pattern tables the sprites will use for their images.
Bit 4 selects which of the two pattern tables the screen will use for its character set.
Bit 5 selects the sprite size. 0 for 8x8 sprites, 1 for 8x16 sprites.
Bit 6 not used
Bit 7 VBlank interrupt enable. When set will cause an interrupt to be triggered when a vertical blank is occurring.

$2001 is a second control port with bits as follows:
Bit 0 not used
Bit 1 Hides the left most column of the screen if 0
Bit 2 Hides sprites in the leftmost column of the screen if 0
Bit 3 Shows screen if set, blank screen when 0
Bit 4 Enables sprites when set.
Bit 5 intensifies Red colors
Bit 6 intensifies Green colors
Bit 7 intensifies Blue colors
I have heard that only one of the color intensifier bits should be set.

$2002 PPU Status gives status of PPU
Bits 0-5 not used
Bit 6 set when a visible bit in sprite 0 intersects a visible background bit. It is set only when the scan line with the intersection is occurring which means it can be used to trigger events on certain scan lines.
Bit 7 VBlank flag. Set when a VBlank has occurred. Once read it becomes 0 until the next VBlank, but many emulators do not emulate this behaviour.

$2003 Sprite Memory Address writing to this register will set the memory address that $2004 will point to for reading and writing sprite data. It should be pointed out that there is only 256 bytes of sprite memory so only a single write to this address is needed to access all of sprite memory.

$2004 read or write the sprite memory address that was set by $2003. Auto increments the next read or write will be the next address.

$2005 Screen Position written to twice to set the position of the screen. This is used to scroll the screen as if set to non-zero the portion of the screen outside of screen memory will come from one of the other name tables.

$2006 PPU Memory Address writing to this register will set the memory address that $2007 will point to for reading and writing PPU data. It requires two writes to set an address. The first write sets the high byte with the second being the low byte. Yes, this is the opposite of how the 6502 does things.

$2007 read or write the PPU memory address that was set by $2006. Auto increments the next read or write will be the next address. If vertical writing is enabled (via $2000 bit 2), next read/write address will be increment end by 32.

Programming the PPU is simply a matter of reading and writing to the appropriate registers. The sprites are a special case as if you are using a lot of sprites there is a much more efficient way of setting them. But before we cover sprite DMA, it would probably be a good idea to discuss sprites.

Friday, March 14, 2014

The NES Screen

Now that we have an ASCII character set, we are now ready to display something. The display memory is broken into two parts called the NAME table and the ATTRIBUTE table. They are paired together and the NES supports up to four of them. The NAME table is a 32x30 grid of bytes that store the indexes of which characters to display. The palette used to display the character is determined by the ATTRIBUTE table. This is where things get really confusing.

There are only 64 bytes of attribute memory yet 32 x 30 = 960. there are only four palettes so only two bits per tile are needed. Even then, there is only enough memory for a quarter of the display. The obvious solution to this problem is to group the tiles into 2x2 blocks that share the same palette. This is what is done, but instead of storing the colour information consecutively, the PPU groups 2x2 blocks into another 2x2 block as shown in the figure below.

Up to four screens can be set up, but the NES only has enough memory built in for two screens. Additional RAM can be added if four way scrolling is required, but many games only need horizontal or vertical scrolling. If addition NAME/ATTRIBUTE table memory is not present, the missing memory is mirrored copies of existing memory based on the scrolling mode that is set.

The easiest way of thinking about the four screens is to think of them as a 2x2 grid of screens. You then set the pixel position of the display which determines which parts of the four screens are displayed.

Now that we have an overview of what is required to display a piece of text, we are ready to create a hello world program which is what we will be doing next.

Saturday, March 8, 2014

The ASCII Connection

It is important to remember that the characters that make up a character set are whatever you want them to be. If you are using a memory manager, you may have multiple character sets and be able to swap them as needed. That means that there are no rules as to where letters appear in the character set or even if there are any alpha-numeric characters at all. If the game you are creating doesn't have any text in it, then there is no requirement to have any letters in the set. Even if your game has text, it may not be necessary to have the entire set of letters or you may even incorporate the words as part of other images.

All this freedom is really nice to have, but one thing to remember is that the assembler you are using to convert the 6502 assembly language code into machine language does have a format it uses. While this is not directly relevant to the resulting code that is produced, this fact can be taken advantage of to reduce the amount of work that you will need to do. You see, any strings that you use get stored as ASCII values. If you are not using ASCII then instead of using strings such as "Hello World" in your code, you will need to have a table of numbers with the numbers being the character indexes you used for the letters you want to display. This is simple but time-consuming work that is easy to avoid.

ASCII, which stands for American Standard Code for Information Interchange, is a rather old standard that was created in the early 60's to allow for a standardized way of transmitting information between computers. It is a 7-bit standard because 6-bits was insufficient for all the characters needed and 8-bits would require an extra bit be transmitted for every character which was deemed too costly. Remember that next time you send a multi-megabyte image to someone!

The first 32 characters (indexes 0 through 31) are control codes so these characters can be used for any graphics you want as they don't have a visual representation. 32 is the space character, 48 through 57 are numbers 65 through 90 are upper case letters, 97 through 122 are lower case letters and between these are various punctuation marks. This layout may seem largely random but if you convert the indexes to hexadecimal then you will notice that displayable characters start at 20, numbers at 30, letters at 41 and 61.

I am sure that if one looked, they would be able to find a NES ASCII Character set, but making your own set reduces the chance for any legal claim. Here is an image of my character set that I will be using as the starting point for the games that I will be developing. It is not the greatest character set, but is good enough for my needs. Any characters that are not needed can easily be replaced with other graphics, though I suspect that this will not be necessary for my projects.

Now that we have characters that can be displayed, we are ready to take a look at how to actually display things on the screen.

Friday, February 28, 2014

NES Character Sets

A character set, also known as a tile set or pattern table, is simply a collection of 256 8x8 tiles which are used to build the display. There is no built-in character set for the NES so at least one needs to be provided. Without using any memory managers (a complex subject that will be covered when developing the map games) the NES supports 2 character sets. One for building the game screen with and optionally a second set for sprites.

There are a number of tools available for the creation of NES character sets, with YY-Chr being the tool that I currently am using. There are others available and the format for character sets is simple enough that it would be fairly easy to create your own tool for editing or converting images into character sets. As some of the tools I am creating for editing map data is similar to what would be needed, I may develop a very simple character set editor myself. It is also possible that the cartridge contains RAM for the character set in which case you can manipulate the character set in code.

Each character is 16 bytes with with the 256 characters in consecutive order. In other words, the first 16 bytes are for the first character (index 0), the next 16 bytes are for the second character (index 1), and so on. As characters make use of a 4-color palette, two bits are needed for each pixel. This is where things can be confusing as instead of putting the bits together, they are stored as separate bit-planes. Bit-planes store color information by combining separate proper bitmaps to form the color as shown in the following image.

The image shows the high bit first followed by the low bit, which is the opposite order that the NES expects the planes to be stored. The first 8 bytes represent the low-order bits with the next 8 bytes being the high-order bits.

Now that we know how graphics are stored, we need to create a character set for use in the trivia game that we are ultimately going to make. That will be covered next.

Friday, February 7, 2014

NES Palette

Traditionally, one writes a hello world program when they first start learning a new language or platform. Unfortunately, there is a lot of graphics related material that one must understand before they can write Hello World for the NES, so the next few NES articles will cover the PPU. I will start in higher level concepts then finish by writing a hello world program in 6502 Assembly Language.

Before I begin, I should point out that today the term bitmap is incorrectly used to indicate a raster graphics image. Technically speaking a bitmap is a two-color image with each pixel of the image being a bit. I am using the term to represent a grid of pixels that form an image. This is the common usage of the word, even if it technically is not the correct usage.

Today it is fairly simple to create and manipulate images. When writing code to manipulate bitmaps most the time you are dealing with pixels in the RGBA (Red Green Blue Alpha) format. This means that each color component is a byte an each pixel requires 32 bits. As today memory is so much cheaper than it was a couple of decades ago, this is not that big of issue. This is a fairly inefficient way of storing graphics as not every color available is used. Older graphic cards/chips reduced the total number of colors available to a small number of colors. Instead of specifying the color in RGBA format, bitmaps would just hold a number indicating what palette index to use. The NES has a 64 color palette as shown below. As you can see, 8 colors are not actually available and there are a couple of colors that are almost identical, so the palette is not really64 colors.

Using only 6 bits per pixel is certainly a drastic reduction from using 32 bits per pixel, but this can be taken even further. If we make the image a grid of 8x8 pixels and give each tile in this grid it's own 4-color palette taken from the 64 color palette. To reduce things even further, we don't really need a palette per tile, instead lets have four palettes that the tiles can use. Sprites can also have their own separate set of palettes for a total of 8 possible palettes.

This is very simple. Simply to set the palette you simply send 32 bytes to the PPU, with each byte being what color from the 64-color palette to use. Of course, things are not quite that easy. The first color in the first palette represents the background color. This will be the same color for the other 7 palettes so the first of the four colors is always the background color. Due to the way the PPU is set up, writing a color into the first sprite-palettes first color will also set the background color so when putting together the palettes you need to keep the background color in mind.

This now leads to the tiles. Having every tile uniquely programmable would take a lot of memory. Instead, lets limit the tiles to 256 different tiles and call this a character set. Character Sets will be discussed next.

Thursday, May 3, 2012

TMoD episode 4: Space Shield

When researching different disasters the idea of a solar flare was high on the list. The big problem was finding a classic-style arcade game that would work with the theme. Thinking about how to stop a solar flare lead to the obvious placement of satellites, even though realistically you would need way too many. Placing satellites immediately got me thinking of the space junk problem that we are having. This immediately got me thinking about the road-crossing games I played on my cousins Atari 2600 so the game was set.

One of the big issues with the game was the collision detection. Unfortunately, Flash does not have per-pixel collision detection (at least not built in to Flash Player 10, I haven't researched Stage 3D or Starling enough to know if Flash Player 11 does). The object detection is bounding box based. Point based detection does support a per pixel option so I set off on my task to write a simple per-pixel collision detection routine. I originally opted for a simple brute force approach. Flash let's you find the bounding box of a clip relative to a container, so getting the bounding boxes of the two colliding objects then finding the intersection of the two objects is easy. My thought was then to look though the points in the intersection rectangle to see if they both flag collisions at the same points and are therefore overlapping.

In my tests this worked great. The problem is that my tests used vector shapes. When bitmap images were used, anything within the bitmap bounding box was flagged even if it was transparent. There is really no excuse for this other than lazy or incompetent programming at Adobe. This meant that I would have to look at the bitmap data myself. Sadly, this is not a very efficient thing to do.

To get the pixel information, I need to draw into a bitmap data object. This first requires getting a copy of the transformation matrix used to draw the display object. Thankfully this is easily accessible. Next, a translation is applied to only draw the intersection portion of the image. The drawing is done. Repeat for the second object in a second BitmapData object and then compare the pixels in the two bitmapData objects. As this is done using a getPixel function call repeatedly, I honestly can't see this as being very efficient. Then I remembered an article I read about using color filters. This got me thinking about how I could get rid of one of the bitmapData objects and combine the two images into one.

First, you set the bitmapData background color to black. The draw function supports color filters making it possible to max out or clear a color channel. By maxing the red and green channels while dropping the blue channel of the first image then dropping the red channel of the second image while maxing out the green and blue channels of the second image while drawing it over the first image using the ADD filter you will end up with white pixels where the two images overlap. There is actually a built in routine for finding a bounding box containing a particular color so finding the overlapping pixels is easy.

This is not the best way of doing the collision detection, but it works. There are some edge cases where it won't work, but that isn't a problem in this game. It is sad that such a straightforward thing requires so much work.

Friday, April 6, 2012

Easter Egg Hunt

I am a bit late writing this this week. I normally post my site and this blog on thursday evening as that way I can keep my friday's free. This week I ended up going out on thursday. Last week on Blazing Games, I posted the open source version of my Easter Egg Hunt game where you navigate a 3D maze in order to find a dozen easter eggs. The eggs are randomly generated. The 3D maze was generated using a very old technique known as the painter's algorithm. I was thinking that none of the techniques used in the game are overly noteworthy then I realized that a lot of the techniques in the game, while not useful when you have OpenGL (including ES and WebGL) to work with, may still be relevant if you were limited to a 2D drawing system like the HTML 5 Canvas. HTML 5 developers, at least the ones that are targeting Internet Explorer, may need to fall back on these old techniques. This is due to MS deciding not to support WebGL.

The egg is simply a palette swapping varient. There are 12 patterns for the eggs, but the two colours that are used to draw the pattern are randomly selected. This is a really old technique that you will see in older video games where the same monster sprite is used repeatedly but given different colours so the same artwork can be used for a large number of monsters. This would be a little tricky to do with the HTML 5 canvas as I don't recall a way of changing an image's palette. I would have to look into this, but if you can't change the palette, you can use the image data to manipulate individual pixels.

The 3D maze uses a technique known as the painters algorithm. Essentially you treat the 3D maze as a series of blocks. You draw the furthest away blocks first, then the nearer blocks, and so on until you are painting the close blocks. The downside to this is you are potentially overdrawing many times. The nice thing about this technique is it is very easy to implement and the results can look very good. Of course, you aren't going to get 360 degrees of smooth scrolling movement, but it is something that can be done on any browser that supports the canvas.

It would be nice if there was a hardware accelerated 3D standard that would work on all browsers, but even if that does happen, it won't be for years. Perhaps for next Easter, I will port the game to HTML 5, though I am hoping I come up with a far better idea for an Easter game before then.

Thursday, September 15, 2011

DDTe3 Hours 11 Drawing Hexes

In order to manipulate hex-shaped images, masks are going to have to be used. In canvas parlance, clipping regions are the proper term. Clipping is one of the ugliest parts of the canvas API, as it uses a shrinking region logic. You have to save the canvas state before setting up the region, set up the region, do the clipping based drawing, then restore the state of the canvas. Thankfully clipping is powerful enough to allow paths to be used to create a clipping region. This means that if we can create a simple hex-drawing function, that hex can be used as a clipping shape for creating hex-shaped puzzle pieces.

Actually, instead of drawing a hex, creating an array of points is more useful, especially when it comes time to creating the border. The hex, when you look at it, really only has a small number of important numbers that need to be calculated. Figure 1 shows the special points that are needed to create the hex and listing 1 shows the function used for creating the array of points.

figure 1

function buildHexPath(rect, pathlist)
{
   var path = pathlist;
   // make sure path list of points contains proper number of points
   if (path == null)
       path = new Array();
   while (path.length < 7)
       path.push(new BGLayers.Point());
   while (path.length > 7)
       path.pop();

var halfx = rect.x + rect.width / 2;
   var x2 = rect.x + rect.width;
   var y1a = rect.y + rect.height / 4;
   var y1b = rect.y + 3 * rect.height / 4;
   var y2 = rect.y + rect.height;

   path[0].x = rect.x;
   path[0].y = y1a;
   path[1].x = halfx;
   path[1].y = rect.y;
   path[2].x = x2;
   path[2].y = y1a;
   path[3].x = x2;
   path[3].y = y1b;
   path[4].x = halfx;
   path[4].y = y2;
   path[5].x = rect.x;
   path[5].y = y1b;
   path[6].x = rect.x;
   path[6].y = y1a;
   return path;
}

The hex can now be drawn, but to create a border we need a pair of hexes. My first attempt to create a border hex was to create an inner hex and an outer hex and join the two together. This did not work as expected. If you look at figure 2, the first hex is the normal hex. The second hex is suppose to be the border. My immediate reaction was that the fill method didn’t support complex polygons. This isn’t that surprising of a limitation so I quickly wrote my plan B implementation.

figure 2

The third hex in this figure was created by drawing each segment of the hex separately. This was done in a similar way to the first attempt at creating a border. A outer hex and an inner hex are created and then the points are looped through and combined to create six separate polygons.

function drawHexBorder(ctx, rect, pct)
{
   var outpath = buildHexPath(rect, null);
   var wadj = rect.width * pct;
   var hadj = rect.height * pct;
   var inpath = buildHexPath(new BGLayers.Rectangle(rect.x+wadj, rect.y+hadj, rect.width - wadj-wadj, rect.height - hadj-hadj),null);
   for(var cntr = 0; cntr < 6; ++cntr) {
       ctx.beginPath();
       ctx.moveTo(outpath[cntr].x,outpath[cntr].y);
       ctx.lineTo(outpath[cntr+1].x,outpath[cntr+1].y);
       ctx.lineTo(inpath[cntr+1].x,inpath[cntr+1].y);
       ctx.lineTo(inpath[cntr].x,inpath[cntr].y);
       ctx.lineTo(outpath[cntr].x,outpath[cntr].y);
       ctx.closePath();
       ctx.fill();
   }
}

Once this was done I decided to spend a little bit of time researching how the cavas fill was suppose to work as I was positive that I had seen some drawings that had holes in it. After a few minutes of research (about the same amount of time it took me to write plan-b) I finally stumbled upon the winding rule for filling. It appears that the direction of the edge lines are used as part of determining of the points that are filled. While I have played with a number of fill algorithms, I have never actually implemented a winding algorithm but it does mean that by reversing the order of the points in the inner hexagon, the fill should work properly. This would allow the drawing of the border as a single drawing operation which should be more efficient.

Monday, November 1, 2010

Animating the Canvas part 9: Getting Dirty Quicker

When I was first starting to learn to program, I was told three steps to follow in order to create good programs. The first step is to get your program running. The second step is to get it running correctly. The final step, which is often not needed, is to get it running fast. In the old days (the 1990's) programmers were more concerned with speed (100MHz was fast!) so a lot of programmers would optimize their code before the program was even running. Often, the optimizations used would make their code messier making it harder to actually get the program running correctly or at all.

The big trick to optimizing is simply the knowledge that in the normal course of running only a small percentage of the code being ran is responsible for the bulk of the time. If you can find the parts of your program that most of the executing time is taking place in, you can optimize those areas and gain a fair bit of speed. Tools like FireBug make this very easy as they have tools that do the tracking for you.

Once you know what code needs to be fixed, the key question becomes how to speed it up. My advice would be to learn Assembly language (aka Machine code). While you may never need to use it, which is unfortunate as it is great fun to code in, knowing what the machine is actually doing at a low level will give you a much greater appreciation for what is going on with your program. Another technique, which I don't do as much as I should, is to simply use the debugger to single step through the function in question. This can give you a huge insight into what is going on.

BGLayers have two parts that need focus on. First, is the adding of dirty rectangles. Next is the rendering of things. The addDirty function is relatively minor CPU wise, but during my discovery at how poor Canvas Clipping is I noticed a lot of Rectangles were being created needlessly and wanted to reduce this. Creating a new instance of an object has some overhead to it so avoiding doing so will speed things up. Right at the start of addDirty, a new instance of a rectangle is created. This was done to support null parameters which add the entire layer, but when you consider that the function will call the same function until the parent is reached, this can be very inefficient.

The problem is that a new instance of the rectangle passed has to be created as it has to be altered to assure it is pixel aligned. While this alteration is minor but could muck up the rectangle if it is used by unrelated parts of the program. When we start clipping the dirty rectangle we use recursion to call the function causing many unnecessary rectangles to be created. The solution is to eliminate the recursion. This is easily done by simply having an array with a list of rectangles that need to be clipped. Initially there is only the initial rectangle in this list. If it is clipped, new rectangles get added to the list instead of having to call addDirty. This eliminates the recursion.

With these simple but major changes, the program is run and no noticeable difference in run time. Well, at least I know there will be less garbage collection. My point about not optimizing where there is no need has proven itself as I wasted almost an hour for no gain. Still, I am happier with the code as recursion has been eliminated making the code nicer.

Next, we try to speed up the part that is taking up over two-thirds of the cycles in my tests. Namely the draw self function. This again leads me back to ancient times (the 90's) when the big bugaboo in graphics programming was drawing pixels multiple times per frame. In those times, accelerated graphics cards were hit or miss and accessing video memory was costly. Things have changed today but eliminating unnecessary drawing is still a good idea. In fact, this was planned from the beginning and is why there is a solid variable to let the renderer know if a layer is solid. Anything underneath a solid layer is not going to be seen so if the portion being drawn is contained within the solid layer, only that layer and any layers above it need to be drawn. This is done in the render function, which the drawSelf function is called from. The rendering code has been changed to only call drawSelf if no solid layers contain the bounds being rendered.

    for (cntr = this.children.length - 1; cntr >= 0; --cntr) {
        if ((this.children[cntr].solid) && (this.children[cntr].findRealPosition().containsRect(bounds))) {
            startDraw = cntr;
            break;
        }
    }
    if (startDraw == -1) {
        this.drawSelf(ctx, bounds);
        startDraw = 0;
    }
    for (cntr = startDraw; cntr < this.children.length; ++cntr)
        this.children[cntr].render(ctx, bounds);

That simple change has a drastic effect on performance. DrawSelf now accounts for less than half of the drawing time. The next step will be to make the API friendlier. The code for this post is located at http://www.blazinggames.com/other/openLibrary/html5Libs.

Sunday, October 17, 2010

Animating the Canvas part 8: Splitting Image

Right now the library is good at drawing solid blocks. To be useful, however, better image support is needed. The one class that would make this library sufficient for the creation of games, which is the goal of this after all, would be an image layer that had support for image strips and image grids. Better yet, an image atlas since image strips and image grids are a subset of an image atlas. This is actually very simple to do.

An image atlas, for those who are not familiar with the term, is simply just a large image that contains smaller images. The idea here is that instead of dealing with a large number of image files, you instead only need to load one image and show portions of that image. This speeds up loading as you only need to request a single file instead of having to request multiple files from the server. It is also commonly used with 3D rendering but that is beyond the scope of this series.

What makes this class so easy to create is simply the fact that it inherits most of its functionality from the Layer class. JavaScript does not currently have support for subclasses so I make use of the inheritProperties function that was created earlier. As it is likely that the image layer will be used to display an entire image, the constructor simply contains the image to use. The portion of the image used is set to the entire image, though this can be changed any time allowing this class to be used as an image atlas.

BGLayers.ImageLayer = function(id, image)
{
    this.supr = new BGLayers.Layer(id, image.width, image.height);
    BGLayers.inheritProperties(this, this.supr);

    this.clip = new BGLayers.Rectangle(0,0, image.width, image.height);
    this.image = image;
}

To set the particular portion of the image that should be displayed, the setClip function sets the new portion of the image to be shown. This can be used for animation by simply having a strip or grid of frames and altering the portion being shown every frame.

BGLayers.ImageLayer.prototype.setClip = function(rect)
{
    this.clip.x = rect.x;
    this.clip.y = rect.y;
    this.clip.width = rect.width;
    this.clip.height = rect.height;
    this.addDirty(null);
}

The hard part comes with the drawing as the portion of the image within the clipping bounds. This has to be calculated. This is done in a similar way to the calculation of the real layer position but using the clipping portion of the image. In theory it is possible to just set the canvas clipping region and draw the whole image, but I have had some issues with clipping so figure it is safest to avoid clipping whenever possible. Besides, relying on clipping makes the assumption that the canvas implementation is actually efficiently written where as manually drawing just the clip guarantees that the minimum amount of drawing is being done.

BGLayers.ImageLayer.prototype.drawSelf = function(ctx, bounds)
{
    if (this.findRealPosition().intersects(bounds) == false)
        return;
    var rect = this.realPosition.getIntersection(bounds);
    var scaleX = this.clip.width / this.realPosition.width;
    var scaleY = this.clip.height / this.realPosition.height;
    var boundClip = new BGLayers.Rectangle(
                this.clip.x + (rect.x - this.realPosition.x) * scaleX,
                this.clip.y + (rect.y - this.realPosition.y) * scaleY,
                rect.width * scaleX,
                rect.height * scaleY);
    ctx.drawImage(this.image, boundClip.x, boundClip.y, boundClip.width, boundClip.height,
                rect.x, rect.y, rect.width, rect.height);
}

And that is all that is needed. Creating a new layer is that simple. The atlas functionality of this class should be adequate for most uses so already this library is useful. Still, while doing some debugging on my Tarot project, stepping through the drawing of things did help me notice a lot of inefficiencies in how the library works so I am planning on having an optimization pass. I also want to do a bit of fine tuning to the library to make it clearer what variables are private, even though JavaScript does not support private variables. I may be renaming some things and adding some functionality so there is no need to access private variables. If something is clearly private and somebody modifies it anyway breaking their program it is their problem.

Sunday, April 20, 2008

The Lorenz Attractor

As Professor Edward Lorenz died of Cancer on Wednesday, I thought it would only be fitting that today's article be about the Lorenz Attractor which is one of the earliest (if not the first) example of chaos theory. Chaotic systems are those in which a very small change can have a drastic impact on the outcome of the system over time. The butterfly effect is probably the best known example of this. Essentially, a butterfly flapping it's wings can as a result alter the weather weeks in the future. This happens because the extremely small change in the weather pattern can slowly be amplified over time. This is also the reason why we are so poor at predicting the weather.

Professor Lorenz discovered this fact in the early sixties when he was trying to develop a computer model of the weather. Back then computers were not that powerful so the professor simplified the model so that it consisted of three differential equations:

dx/dt = 10(y - x)
dy/dt = xz + 28x - y
dz/dt = xy - (8/3)z

When running these equations, Professor Lorenz found that if he tried to pick up the equations from where he left them in an earlier run that the outputs were not the same. After verifying that the computer was in fact working properly and that there were not any bugs in his program, he isolated the problem down to the precision in which he was entering the information. He was only entering the data to 3 decimal places where as the computer was keeping the data up to 6 decimal places. Did I mention that this was the sixties and that the computer was primitive?

The result of this is the discovery of Strange Attractors. Other forms of chaos, such as fractals and the Mandelbrot set, have since been discovered and have been used in the generation of some of the most incredible graphics and animations. There are a number of open source programs that can plot these equations for you and there are certainly many books on the subject. I personally feel that it is more fun to write your own generator, but then I like programming and find the programming aspects as much fun (if not more fun) as looking at the results.

Monday, June 18, 2007

Flash Bitmap automap

At the moment the amount of spare time I have is very low, and to make matters worse I am taking a drawing class. I am hoping that they will help me get my artwork up to a much better quality than it currently is. As with every other skill, the key is practice but knowing what you are doing wrong certainly helps.

I am still going to put an effort into getting my 10 hours of work on the site selected project, with any slippages of hours being rolled over to the next week's total. I did manage to finish a rough automap render but it only paints a test pattern as the map classes aren't developed yet because I decided that I wanted to make sure there would be something to post this Friday. This was done a bit different from the way I have normally handled such maps in Flash. Flash 9 has a Bitmap Data class which I am taking advantage of. Technically, this was introduced in Flash 8, but I am jumping from flash player 7 content to flash player 9.

The BitmapData class lets you use an image which is loaded with a separate loader class. I am not sure why you need a separate class that handles the loading of an external image, as it would make more sense to me to have this functionality part of the Bitmap or BitmapData class. Once copied into a BitmapData class, clips from the image can be easily copied to the bitmap used for the automap. This should be fairly fast as I believe the BitmspData class is able to use hardware acceleration, or at least native code for bitmap clipping.

At this point I have a test pattern being displayed, using the 16 tiles that the game uses. If I get around to finishing the map classes before Friday (assuming they can be done in less than 10 hours) I may have a proper map this friday, but if not I will at least be able to post the game running the test pattern.

Blazing Games Development Blog