Re: [stella] Animating the Marbles

Subject: Re: [stella] Animating the Marbles
From: Paul Slocum <paul-stella@xxxxxxxxxxxxxx>
Date: Wed, 03 Jul 2002 11:07:44 -0500

> Hm, if that helps, I found you a few cycles.....

You've got some good ideas. I'm in the middle of converting to 32k and rewriting part of the level data structures so I can spread them across banks. I also just found out that none of the emulators support standard 32k mode without extra RAM. It does work on the Cuttle Cart, though. After I finish this, I'll go over the kernel again and see if I can use your ideas to free up a few more cycles. Thanks for taking a look at it.

The idea of interleaving the color data is great. Originally I had 24 color bytes which didn't require the ASL, but I needed more RAM so I changed it to 12. I probably would have made it interleaved if it had been 12 from the beginning. I just didn't think about it when I did the conversion.

> And why did you have to code 26(!) kernel lines? Can you describe
> what is going on in each of those lines? It looks like you are doing the
> same thing over and over and using the remaining cycles for
> paddles/colors/looping etc. Maybe with some optimizing you can reduce
> the number of kernel lines too.

I got the idea from examining the Maze Craze kernel. This type of kernel helps when you are displaying an asymmetric playfield on every line but the data only changes every few lines (13 in my case). It's especially helpful if you have to do any logic on the playfield data before displaying it. And I do, since PF0-left and PF0-right are stored in the high and low nibbles of one byte.
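
To make the nibble packing concrete, here is a rough Python model of it (not the actual 6502 code). The function name is hypothetical, and it assumes the high nibble is PF0-left and the low nibble is PF0-right. The TIA only uses bits 4-7 of PF0, so one half can be written as-is while the other has to be shifted up four bits first.

```python
def unpack_pf0(packed):
    """Split one packed byte into the two values written to PF0.

    Assumption: high nibble = PF0-left, low nibble = PF0-right.
    The TIA ignores bits 0-3 of PF0, so only bits 4-7 matter.
    """
    pf0_left = packed & 0xFF           # written directly; low bits are ignored
    pf0_right = (packed << 4) & 0xFF   # four ASLs move the low nibble into bits 4-7
    return pf0_left, pf0_right


left, right = unpack_pf0(0xA5)
print(hex(left & 0xF0), hex(right & 0xF0))
```

The shift is the "logic" that has to run before the second PF0 write, which is what makes doing it once and keeping the result attractive.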

The idea is that you store each of the indexed playfield bytes in a buffer when you have spare time towards the start of the loop, so that a few scanlines into the loop they are all buffered. Until then you have to read them with indexed loads, which cost an extra cycle each, plus I have to use four ASLs on the PF0 data. But once everything is in the buffer I'm saving something like 12 cycles per line. At that point I also don't need X to index the playfield data any more, so I save it in the stack pointer and load one of the playfield buffer values into X instead, which saves me another 3 cycles per line. By the end of the loop you've got tons of spare cycles to do other stuff and to get ready for the extra cycles needed to loop and fill the buffer again.
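
The per-write savings can be tallied with standard 6502 cycle counts. This is only a back-of-the-envelope sketch in Python: the instruction sequences are assumptions about what each playfield write looks like, not the real kernel, so it does not try to reproduce the "something like 12" total.

```python
# Standard 6502 cycle counts, assuming no page crossing.
CYCLES = {
    "LDA zp": 3,     # load from a fixed buffer location
    "LDA zp,X": 4,   # indexed load (LDA abs,X is also 4)
    "ASL A": 2,
    "STA zp": 3,     # TIA registers are written with zero-page stores
    "STX zp": 3,
}


def cost(instructions):
    """Total cycles for a straight-line instruction sequence."""
    return sum(CYCLES[name] for name in instructions)


# PF0-right read straight from the level data: indexed load, four shifts, store.
unbuffered_pf0_right = cost(["LDA zp,X"] + ["ASL A"] * 4 + ["STA zp"])
# Any write once the finished value sits in the buffer.
buffered_write = cost(["LDA zp", "STA zp"])
# An ordinary playfield byte read with an index.
unbuffered_plain = cost(["LDA zp,X", "STA zp"])
# A value that is already sitting in X needs no load at all.
write_from_x = cost(["STX zp"])

pf0_right_saving = unbuffered_pf0_right - buffered_write   # shifts + index
plain_saving = unbuffered_plain - buffered_write           # index only
x_saving = buffered_write - write_from_x                   # the skipped load

print(pf0_right_saving, plain_saving, x_saving)
```

Under these assumptions the shifted PF0 write is where most of the saving comes from, with one more cycle for every other indexed load that becomes a plain load, and the 3 cycles from keeping a value in X matches the skipped LDA.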

I later realized I needed the X register again, which is why there is all that messy X manipulation going on. I should probably see if I can rework that whole thing.

The reason there are two 13-line kernels is to alternate which playfield lines are colored, and to alternate which paddles are read. The kernel is so fragile that I couldn't figure out a way to do this with a single kernel. The hardest part is the coloring. The color change in the middle of the screen has to be timed perfectly.

> Even with your current code, the full 256 bytes make no sense, because
> the values for Y are not ranging from 0..255 when drawing the ball. You
> initialize Y with 156, so that's the maximum of bytes you should have to

The problem is that, to display it at the top and bottom, I need roughly 140 empty bytes above and below the marble image. That's fine, except there's no way to do it without indexing across a page boundary, which costs an extra cycle on each line. My kernel totally breaks with an extra cycle per line.
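
The page-boundary penalty can be checked with a small Python model. The helper names are hypothetical, and the 140 is just the rough padding figure from above. An indexed 6502 load takes one extra cycle whenever base + index lands in a different 256-byte page than base, and two blocks of padding alone are already bigger than one page.

```python
PAGE_MASK = 0xFF00
PADDING = 140  # rough figure for the empty bytes above and below the image


def crosses_page(base, index):
    """True if base + index is in a different 256-byte page than base."""
    return (base & PAGE_MASK) != ((base + index) & PAGE_MASK)


def indexed_load_penalty(base, index):
    """Extra cycles for an indexed load such as LDA abs,Y or LDA (zp),Y."""
    return 1 if crosses_page(base, index) else 0


# Even ignoring the image itself, the padding spans more than one page,
# so no starting offset within a page keeps the whole table inside it.
span = 2 * PADDING
fits_somewhere = any(not crosses_page(start, span - 1) for start in range(0x100))

print(indexed_load_penalty(0xF000, 0xFF), indexed_load_penalty(0xF001, 0xFF))
print(fits_somewhere)
```

So wherever the table starts, some marble positions end up reading across a boundary and paying the extra cycle.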

