[stella] Idea for Scrolling Maze Engine.

Subject: [stella] Idea for Scrolling Maze Engine.
From: Schwerin <schwerin@xxxxxxxx>
Date: Sun, 7 Feb 1999 01:08:37 -0500
Wow!  It's so great to see the 6502(7) discussed!  It feels like my small
isolated programmer bubble has expanded into a medium size isolated
programmer bubble...but enough small talk.  Being an eager new convert
to the land of 2600 programming and the TIA, I've ingested a few of the
major documents and decided to write a small kernal to see if I have
any clue how any of this works.  Before I go on...

DISCLAIMERS:
1) This is not a finished, nor even well developed idea, but more of a
"test of concept".  Let me know if I am sniffing up the wrong tree in
my approach, because I know it needs optimizing.
2) I know the STA WSYNC's are missing
3) I know I write the PF registers at the wrong times.

Now that you're expecting the worst :) ...


It's assembly code that trys to draw a moving playfield.  Here's my idea.

The spaceship is always locked to the center of the screen, but it can face
any of the four orthagonal directions.  (I don't deal with drawing the ship
yet)

Scrolling can happen in any one of the same four directions, N S E W

Assume an assymetrical playfield, which will mean 40 blocks of horizontal
resolution (4 color clocks each).

The vertical resolution is irrelevant at this point, but assume doubling of
scan lines for extra compute time, and assume that at the end of the day
the maze looks squarish and not rectangular.

Our basic maze unit is an eight by eight outline box, in units of playfield
blocks.

fig 1.

********
*      *
*      *
*      *
*      *
*      *
*      *
********

We have maze data in memory that tells us which sides of the box to draw
and which not to draw.  If you tile "boxes" vertically and horizontally you
get a flexible (and common) maze engine of arbitrary dimensions.  Each box
is free to have some sides drawn and others missing.

You can draw about 5 boxes across a line, actually 6 if scrolling makes some
partial.

Now simplify all of this to the problem of drawing the bottom/top
of the boxes.  It will look something like this:

fig 2.

********        ********        ********

assuming a simple case...

Now lets say you have a 6 bit variable "maze" which tells you when to
draw and when not to draw....let's call each flag A,B,C,D,E, and F
and they are yes/no flags on whether to draw.

Let's say that A is our "partial" line to the left, which may or
may not even appear on screen, followed by B, and then C, D, E, and F (F
must appear partially or in whole)

So our flags correlate to the screen like this...

fig 3.  (shifted 2 blocks right for example)

PF0  PF1     PF2    ~PF0  ~PF1   ~PF2
|  ||      ||      ||  ||      ||      |
AABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFF

Note that there are 2 As and 6 Fs so each is partial.

Store the flags in your "maze" variable like this:

bit 5 4 3 2 1 0
    F E D C B A

The other value we need is a shift value, which ranges from 0 to 7 blocks
of right displacement.  ("0" means B flag is leftmost) store it like this:

bit  4  3  2  1  0
     S2 S1 S0 -  -


Now we have defined the problem (somewhat).

Given "maze"  ... a set of flags on hline drawing
Given "shift" ... a horizontal offset from normal drawing

Create, in a timely fashion, 40 bits of picture data, as stored in
PF0, PF1, and PF2, the picture field registers and you need to
swap the data at the middle of the screen for the right screen side.
(set to copy, and not reflect mode).

Mission Impossible?  I think it's a fun little problem...

Here's my attack...

Create a lookup table that eliminates bit shifting the picture data.

Here's how it works....for any PF register or its right side equivalent,
use the "shift" and two of the maze flags as an index.  Load the byte from the
table and store it to the register...some registers need to load two bytes
from the table and ORA them together for reasons that are easier to draw
than explain.  All this depends on the "coincidence" that the boxes are 8
bits wide...this means that most registers are only affected by two drawing
flags, some by three, but we always know exactly which are affected by what.

PF0 (first 4 blocks) affected by flags A and B
PF1 (next 8) flags A B and C
PF2 (next 8) flags B C and D
altPF0 C and D
altPF1 D and E
altPF2 E and F

This can be cleaned up if you find a scheme so that no register is affected
by more than two flags.  (Or also, you could create a lookup table that uses
three drawing flags in it's index.)

The index is computed like this, combining "shift" with 2 draw flags

index bits 4  3  2  1  0
           S2 S1 S0 B  A      <--- the "BA_index"
           S2 S1 S0 D  C      <--- the "DC_index"
           S2 S1 S0 F  E      <--- the "FE_index"

this is the same as   BAindex = shift OR (maze & 3)

So each table index comes out 0 to 31
There are 9 tables.

PF0BA table gives you the PF0 register for the shift and A B flags index.

PF1BA gives you the PF1 as influenced by flags B and A

PF1DC gives you the PF1 as influenced by the C flag (D line never drawn here)

PF2BA gives PF2 (B influenced)

PF2DC gives PF2 (D and C influenced)

altPF0DC gives right hand PF0

altPF1DC

altPF1FE

altPF2FE


I know it would be easier to understand if you had the tables to look at
(twist my arm, they take long enough to draw on paper :)

So for each scanline (or two scanlines rather) we need to

Access the maze flags
Generate the table indexes
Load the table data.
Overlay some table data with OR statements
Store the data to the PF registers at the right times.
(do other graphics and misc overhead)

If you assume 76 cpu clocks per scanline,
we need the 152 you get from line doubling.

Here goes..I'll count the cycles for you...
(assume no page-crossing indexes)
(remember the DISCLAIMERS)

top:

lda maze    (3)  [3]  get our maze flags from somewhere zero page
and #3      (2)  [5]  mask for the low two bits A and B
ora shift   (3)  [8]  mix in with the shift value to create BA_index
tay         (2)  [10] save index in Y register
lda PF0BA,Y (4)  [14] read table
sta PF0     (3)  [17] note: i don't time these register writes correctly.
                      we can get around this by buffering the 6 bytes.
lda PF1BA,Y (4)  [21] different table, same index
sta tempPF1 (3)  [24] workspace so i can combine table values
lda PF2BA,Y (4)  [28] read table
sta tempPF2 (3)  [31] workspace
lda maze    (3)  [34] we need flags C and D now
lsr         (2)  [36] shift right for divide by 2
lsr         (2)  [38]
and #3      (2)  [40] mask for C and D
ora shift   (3)  [43] make DC_index
tay         (2)  [45]
lda PF1DC,Y (4)  [49] read table
ora tempPF1 (3)  [52] mix with previous data to make picture value
sta PF1     (3)  [55]
lda PF2DC,Y (4)  [59] read table
ora tempPF2 (3)  [62] mix
sta PF2     (3)  [65]
-----compute right side-------
lda altPF0DC,Y (4) [69] read table
sta altPF0     (3) [72] don't copy until correct time, or something like that
lda altPF1DC,Y (4) [76] read table
^^^^^^^^^this is aproximately where we reach scanline #2??^^^^^^
sta altPF1      (3) [79]
lda maze        (3) [82]
lsr             (2) [84]
lsr             (2) [86]
lsr             (2) [88]
lsr             (2) [90] get E and F flags
ora shift       (3) [93]
tay             (2) [95]
lda altPF1FE,Y  (4) [99]
ora altPF1      (3) [102]
sta altPF1      (3) [105] buffer for later?
lda altPF2FE,Y  (4) [109]
sta altPF2      (3) [112]
lda altPF0      (3) [115]
sta PF0         (3) [118]
lda altPF1      (3) [121]
sta PF1         (3) [124]
lda altPF2      (3) [127]
sta PF2         (3) [130]
......etc.................that's under 152 cycles and without much
tinkering yet.


Congratulations if you got this far, I know I'm getting a bit blurry eyed
after pondering this problem for a lot of the day.

My current idea is to go for 6 bit table indexes rather than 5 bit, so there
are 64 table entries, and 8 tables, to reduce the number of ORA's that happen.
Also, you simplify computing the index to eliminate LSR's:

given a precomputed-once-per-frame shift and shift_

       shift  is bit 5  4  3  2  1  0
                    S2 S1 S0  -  -  -

       shift_ is bit 5  4  3  2  1  0
                     -  -  - S2 S1 S0


ABC_index is bit 5  4  3  2  1  0  which is shift OR (maze & 7)
                 S2 S1 S0 C  B  A

DEF_index is bit 5  4  3  2  1  0  which is shift_ OR (maze & 56)
                 F  E  D  S2 S1 S0

The tables are:
PF0ABC,
PF1ABC,
PF2ABC,       PF2DEF,      <--- ORA number1
altPF0ABC, altPF0DEF,      <--- ORA number2
           altPF1DEF,
           altPF2DEF

the 8 tables come to .5K which may be a limiting factor for 2K or 4K formats
if that is a goal.

I think that's enough for now.

-Andrew Schwerin
schwerin@xxxxxxxx



Tune in next time for Part II: Drawing the Vertical Walls....



--
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/

Current Thread