Subject: [stella] "Zombie" optimizations
From: Thomas Jentzsch <tjentzsch@xxxxxx>
Date: Fri, 17 Jan 2003 09:16:59 +0100
Hi,
while "desperately" seeking free cycles to fulfill Glenn's Death
Derby wishes, I remembered some code I already used in Thrust more than
two years ago.
The known 'elegant' DCP solution needs 26 cycles, and 11 of those cycles
are overhead for just determining whether to draw or not. The new
solution reduces that overhead to only 5 cycles!
The main difference (and disadvantage) is that it requires a 2nd RAM
register which stores the end of the vertically displayed area of the
Zombie in a very special way. The trick is to use a *signed* compare.
Here is a brief explanation:
Above the displayed area, Y is signed-compared with M0_Y (which is
simply loaded with the value that the Y register will have when drawing
starts). As long as the result is positive and not equal, the
draw routine is skipped.
When the result is equal, the 2nd RAM variable is copied into M0_Y. This
one is loaded with the Y value at the end of the display area, ORed with
#$80 (setting bit 7). Therefore, as long as Y is inside the display area,
the signed compare with Y will always be negative and the Zombie is drawn.
Below the displayed area the compare becomes positive again and stays
positive until the end of the kernel.
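The three cases above can be sketched as a small simulation. This is a Python stand-in rather than 6502 code; the helper names cpy_flags and simulate_kernel are made up for illustration, and the values $7f, $40 and $b9 are just one possible setup (start of kernel, first Zombie line, and last Zombie line $39 with bit 7 set).

```python
def cpy_flags(y, mem):
    """Return (N, Z) as CPY would set them for the 8-bit result of Y - mem."""
    diff = (y - mem) & 0xFF
    return diff >> 7, int(diff == 0)


def simulate_kernel(start_y, m0_y, m0_y2):
    """Walk Y from start_y down to 0 and record which path each line takes."""
    paths = {}
    for y in range(start_y, -1, -1):
        n, z = cpy_flags(y, m0_y)
        if n:                # negative result: inside the display area
            paths[y] = "draw"
        elif not z:          # positive and not equal: above or below the area
            paths[y] = "disable"
        else:                # equal: first line, switch to the end value
            m0_y = m0_y2
            paths[y] = "enable"
    return paths


paths = simulate_kernel(0x7F, 0x40, 0xB9)
print([hex(y) for y, path in paths.items() if path == "draw"])
```

With these values the single "enable" line is Y = $40 and the "draw" lines are Y = $3f..$39; every other line takes the "disable" path.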
Due to the signed compare, this trick works only in kernels with fewer
than 128 kernel loops, which is no problem for 2LK.
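A rough way to see the limit, again as a Python sketch with invented helper names: once Y can be 128 or more above M0_Y, bit 7 of the difference is already set above the display area, so the "negative" test fires too early. The loop counts 96 and 192 are only example values.

```python
def n_flag(y, mem):
    """N flag after CPY: bit 7 of the 8-bit difference Y - mem."""
    return ((y - mem) & 0xFF) >> 7


def skips_correctly_above(start_y, m0_y):
    """True if every line above m0_y leaves N clear, so drawing is skipped."""
    return all(n_flag(y, m0_y) == 0 for y in range(start_y, m0_y, -1))


print(skips_correctly_above(0x5F, 0x20))  # 96 kernel loops
print(skips_correctly_above(0xBF, 0x20))  # 192 kernel loops
```

The first call prints True; the second prints False, because at Y = $bf the difference $bf - $20 = $9f already has bit 7 set.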
For adapting the new routine to DD, I had to rearrange the code a bit
(because of the length of the equal branch), so I "only" gain 5 cycles:
    cpy     M0_Y            ; 3
    bmi     .drawM0         ; 2³        this taken branch costs one cycle
    bne     .disableM0      ; 2³
;.enableM0:
    lda     M0_Y2           ; 3         copy the end value into
    sta     M0_Y            ; 3          the compared one
    lda     #2              ; 2         enable the missile
    sta     ENAM0           ; 3
    bpl     .continueM0     ; 3 = 21
.disableM0:                 ; 8         this branch is executed above
    SLEEP   5               ; 5          and below the displayed area
    lda     #0              ; 2         disable the missile
    sta     ENAM0           ; 3
    beq     .continueM0     ; 3 = 21
.drawM0:                    ; 6
    lda     (M0_Ptr),y      ; 5         bytes stored as: mmmmss00
    sta     HMM0            ; 3
    asl                     ; 2
    asl                     ; 2
    sta     NUSIZ0          ; 3 = 21
.continueM0:
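The cycle totals in the comments can be cross-checked by adding up each path. The following Python sketch just transcribes the counts from the listing; it assumes that no branch crosses a page boundary and that the indexed load does not cross a page, since either would add a cycle.

```python
# Cycle counts per path, copied from the comments in the listing.
# Not-taken branches cost 2 cycles, taken branches 3.
PATHS = {
    "enable": [
        ("cpy M0_Y", 3), ("bmi (not taken)", 2), ("bne (not taken)", 2),
        ("lda M0_Y2", 3), ("sta M0_Y", 3), ("lda #2", 2),
        ("sta ENAM0", 3), ("bpl (taken)", 3),
    ],
    "disable": [
        ("cpy M0_Y", 3), ("bmi (not taken)", 2), ("bne (taken)", 3),
        ("SLEEP 5", 5), ("lda #0", 2), ("sta ENAM0", 3), ("beq (taken)", 3),
    ],
    "draw": [
        ("cpy M0_Y", 3), ("bmi (taken)", 3), ("lda (M0_Ptr),y", 5),
        ("sta HMM0", 3), ("asl", 2), ("asl", 2), ("sta NUSIZ0", 3),
    ],
}

totals = {name: sum(cycles for _, cycles in steps) for name, steps in PATHS.items()}
print(totals)
```

All three paths come out at 21 cycles, so the routine takes constant time regardless of which branch is executed.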
M0_Y is initialized with the distance from the bottom of the kernel,
M0_Y2 with M0_Y - ZOMBIE_HEIGHT + $81.
Example:
M0_Y = $40, ZOMBIE_HEIGHT = 8, M0_Y2 = ($40 - (8-1)) | $80 = $b9
Y = $7f..$41: N-Flag = 0, Z-Flag = 0 -> skip drawing
Y = $40 : N-Flag = 0, Z-Flag = 1 -> copy M0_Y2 into M0_Y
Y = $3f..$39: N-Flag = 1, Z-Flag = ? -> draw
Y = $38..$00: N-Flag = 0, Z-Flag = ? -> skip drawing
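The initialization of M0_Y2 can be written either as an addition or as an OR. A small Python sketch (the function names are invented here) shows that both forms agree as long as the last Zombie line is in the range $00..$7f, which the 128-loop limit guarantees:

```python
def m0_y2_add(m0_y, height):
    """M0_Y - ZOMBIE_HEIGHT + $81, truncated to 8 bits."""
    return (m0_y - height + 0x81) & 0xFF


def m0_y2_or(m0_y, height):
    """Last Zombie line (M0_Y - (height - 1)) with bit 7 set."""
    return ((m0_y - (height - 1)) & 0xFF) | 0x80


print(hex(m0_y2_add(0x40, 8)), hex(m0_y2_or(0x40, 8)))
```

For M0_Y = $40 and ZOMBIE_HEIGHT = 8, both forms evaluate to $b9, i.e. the last drawn line $39 with bit 7 set.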
Have fun!
Thomas
_______________________________________________________
Thomas Jentzsch | *** Every bit is sacred ! ***
tjentzsch at web dot de |