Subject: [stella] "Zombie" optimizations
From: Thomas Jentzsch <tjentzsch@xxxxxx>
Date: Fri, 17 Jan 2003 09:16:59 +0100
Hi,

while "desperately" searching for free cycles to fulfill Glenn's Death Derby wishes, I remembered some code I already used in Thrust more than two years ago.

The known 'elegant' DCP solution needs 26 cycles, and 11 of those cycles are overhead just for determining whether to draw or not. The new solution reduces that overhead to only 5 cycles!

The main difference (and disadvantage) is that it requires a 2nd RAM register, which stores the end of the vertically displayed area of the Zombie in a very special way. The trick is to use a *signed* compare.

Here is a brief explanation:

Above the displayed area, M0_Y (which is simply loaded with the value that the Y register will have when drawing starts) is signed-compared with Y. As long as the result is positive and not equal, the draw routine is skipped. When the result is equal, the 2nd RAM variable is copied into M0_Y. This variable is loaded with the Y value at the end of the display area ORA #$80 (setting bit 7). Therefore, as long as Y is inside the display area, the signed compare with Y will always be negative and the Zombie is drawn. Below the displayed area the compare becomes positive again until the end of the kernel.

Due to the signed compare, this trick only works in kernels with fewer than 128 kernel loops, which is no problem for 2LK.
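The flag behaviour described above can be checked off-target with a small simulation. The following is only a minimal sketch in Python, not part of the kernel: the helper names `cpy_flags` and `simulate`, the action labels, and the top-of-kernel value `$7f` are assumptions made for illustration. It models just the N and Z flags that a 6502 CPY leaves behind and walks Y down through the kernel.

```python
def cpy_flags(y, mem):
    """Return the (N, Z) flags a 6502 CPY leaves after comparing y with mem."""
    diff = (y - mem) & 0xFF          # CPY computes Y - M in 8 bits
    return diff >> 7, int(diff == 0)  # N = bit 7 of the result, Z = result is zero


def simulate(start_y, height, top=0x7F):
    """Walk Y from top down to 0 and record what each kernel line would do.

    start_y and height are hypothetical example inputs; top is assumed to be
    below 128 so that the signed compare stays valid.
    """
    m0_y = start_y                            # compared value above the sprite
    m0_y2 = (start_y - height + 0x81) & 0xFF  # end value with bit 7 set
    actions = {}
    for y in range(top, -1, -1):
        n, z = cpy_flags(y, m0_y)
        if n:                     # negative result: inside the display area
            actions[y] = "draw"
        elif z:                   # equal: swap in the end value, enable missile
            m0_y = m0_y2
            actions[y] = "enable"
        else:                     # positive and not equal: above or below
            actions[y] = "skip"
    return m0_y2, actions


if __name__ == "__main__":
    end_value, actions = simulate(0x40, 8)
    print(f"M0_Y2 = ${end_value:02x}")
    for y in (0x41, 0x40, 0x3F, 0x39, 0x38, 0x00):
        print(f"Y = ${y:02x}: {actions[y]}")
```

With a start value of `$40` and a height of 8, the sketch computes an end value of `$b9`, skips every line above `$40`, enables at `$40`, draws from `$3f` down to `$39`, and skips again from `$38` to `$00`.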
For adapting the new routine to DD, I had to rearrange the code a bit (because of the length of the equal branch), so I "only" gain 5 cycles:

    cpy M0_Y            ; 3
    bmi .drawM0         ; 2³    this taken branch costs one cycle
    bne .disableM0      ; 2³
;.enableM0:
    lda M0_Y2           ; 3     copy the end value into
    sta M0_Y            ; 3     the compared one
    lda #2              ; 2     enable the missile
    sta ENAM0           ; 3
    bpl .continueM0     ; 3 = 21

.disableM0:             ; 8     this branch is executed above
    SLEEP 5             ; 5     and below the displayed area
    lda #0              ; 2     disable the missile
    sta ENAM0           ; 3
    beq .continueM0     ; 3 = 21

.drawM0:                ; 6
    lda (M0_Ptr),y      ; 5     bytes stored as: mmmmss00
    sta HMM0            ; 3
    asl                 ; 2
    asl                 ; 2
    sta NUSIZ0          ; 3 = 21
.continueM0:

M0_Y is initialized with the distance from the bottom of the kernel, M0_Y2 with M0_Y - ZOMBIE_HEIGHT + $81.

Example:
M0_Y = $40, ZOMBIE_HEIGHT = 8, M0_Y2 = ($40 - (8-1)) | $80 = $b9

Y = $7f..$41: N-Flag = 0, Z-Flag = 0 -> skip drawing
Y = $40     : N-Flag = 0, Z-Flag = 1 -> copy M0_Y2 into M0_Y
Y = $3f..$39: N-Flag = 1, Z-Flag = ? -> draw
Y = $38..$00: N-Flag = 0, Z-Flag = ? -> skip drawing

Have fun!
Thomas
_______________________________________________________
Thomas Jentzsch         | *** Every bit is sacred ! ***
tjentzsch at web dot de |