Subject: Re: [stella] MultiSpriteDemo update (source+binary)
From: emooney@xxxxxxxxxxxxxxxx (Erik Mooney)
Date: Tue, 08 Apr 1997 07:29:53 GMT

>>>I wonder why are you asking this :) ...maybe you want to use all the
>>>available cycles during a scanline, without even doing the STA WSYNC?
>>
>>I had a routine that needed to rewrite both GRP registers each line while
>>updating all three PF registers twice per scanline (non-repeating
>
>hmmm... looks like you're working on something very interesting... :)

I was working on a kernel for an Arkanoid-type game, using the playfield
for the walls and the GRP registers for the capsules (and of course the
ball for the ball). It wasn't working out the way I had it - there just
weren't enough cycles. Trimming it down to 32 blocks wide and setting
PF-Reflect so it didn't use PF0 (see Super Breakout), and restricting the
capsules so that only one could be in any vertical zone, necessitating
only one GRP write per scanline, worked with just barely enough time left
over to check the ball (not using the PHP trick, though.) Positioning
sprites while making four playfield writes on the same line is just about
impossible, though, so I leave a one-scanline gap between rows of blocks
(again see Super Breakout) during which I can reposition RESP1 for the
next capsule.

>>and checking the ball's Y coordinate and writing to ENABL if necessary
>
>There's that incredible PHP trick for this... I always wonder if Stella
>designers had already it in mind when they created the hardware.

That's the trick that sets the stack register to ENAMx or ENABL, compares
two memory locations (or any register and one memory location) and then
just does PHP because the "equal" flag is bit 1 of P which matches the
data bit that ENAxx is looking for, right? It is a very nice routine,
fast and does not branch. It works as long as you only want a
one-scanline high object, or if you only run the routine every N lines,
where N is the height of the object in scanlines. I think the Stella
designers must have had this in mind, because what other reason is there
for ENAxx using bit 1 instead of bit 0 for data?
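
For anyone who hasn't seen it written out, here is a minimal sketch of
the single-height version as I understand it. BallY and the two RAM
addresses are made-up names for illustration, not from any real game,
and like everything else here it has not been run on hardware.

```asm
        processor 6502

ENABL    = $1F          ;TIA ball enable, only bit 1 (D1) is used
Scanline = $80          ;hypothetical zero-page variable: current line
BallY    = $81          ;hypothetical zero-page variable: line to draw ball

        org $F000

        LDX #ENABL      ;+2  2
        TXS             ;+2  4  S now points at ENABL
        LDA Scanline    ;+3  7
        CMP BallY       ;+3 10  Z = 1 only when Scanline = BallY
        PHP             ;+3 13  P lands on ENABL; bit 1 of P is Z
```

The PHP also decrements S, so S has to be reloaded (or deliberately
reused for the next enable register down) before anything else touches
the stack.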

This works very well for single-height objects. And considering this
routine is used in Combat, I'm pretty sure it was intentional. This would
not quite work for games such as Centipede/Millipede, which seem to run
the enable-player-bullet routine every three lines but it's a six-line
object.

Is there an easy way to modify the PHP routine for a multiple-height
object? Looks like you'd SBC the two numbers and if the result is less
than N you'd enable the object. This would work given an easy way of
setting the Z flag from the carry flag without a branch.

Warning, untested code! For the following code, MissileY equals the LAST
scanline on which you want the missile to display, Scanline equals the
current scanline number, and the code assumes you want a missile of
height 4 scanlines. (So if MissileY = 50, the missile will be enabled for
scanlines 47 through 50.)

        LDX #$1E        ;$1E = ENAM1 - use $1F for ENABL or $1D for ENAM0
        TXS
        LDA MissileY
        SEC
        SBC Scanline    ;A has (MissileY - Scanline). If it is >=0 but <4,
                        ;we want the carry clear.
        CLC
        ADC #252        ;If 0 <= A <= 3, the carry will now be clear.
        LDA #00
        ADC #00         ;If the carry was clear, A now = 0, so Z is set.
        PHP             ;Plug it into ENAM1.

This can be optimized to

        LDX #$1E        ;+2  2
        TXS             ;+2  4
        LDA MissileY    ;+3  7
        SEC             ;+2  9
        SBC Scanline    ;+3 12
        ADC #251        ;+2 14
        LDA #00         ;+2 16
        ADC #00         ;+2 18
        PHP             ;+3 21

because if MissileY >= Scanline, the carry is still set after the SBC, so
ADC #251 does the same thing as CLC + ADC #252 did before. If MissileY <
Scanline, the carry is clear but A is a large unsigned number after the
SBC, so the carry will be set whether we add 251 or 252.

Not bad, 21 cycles to handle an object of any height (for a height of N
scanlines, add 255-N instead of 251). For more than one object, you have
to set S for the first but not afterward, since each PHP decrements S to
the next enable register down. That makes 38 cycles for two objects, and
55 for three. With three objects, it leaves enough time to rewrite either
the playfield or GRPx, but probably not both.
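
To make the 38- and 55-cycle figures concrete, here is a rough sketch of
the three-object case, starting S at ENABL so the PHPs walk down through
ENAM1 and ENAM0. BallY, Missile1Y, Missile0Y, HEIGHT and the RAM
addresses are names I am assuming for illustration, all three objects
are assumed to be the same height, and it is just as untested as the
rest.

```asm
        processor 6502

ENABL     = $1F         ;TIA enables: ENABL=$1F, ENAM1=$1E, ENAM0=$1D
HEIGHT    = 4           ;assumed object height in scanlines
Scanline  = $80         ;hypothetical zero-page variables
BallY     = $81         ;each Y holds the LAST line the object shows on
Missile1Y = $82
Missile0Y = $83

        org $F000

        LDX #ENABL          ;+2  2
        TXS                 ;+2  4  set S once, for the first object only
        LDA BallY           ;+3  7
        SEC                 ;+2  9
        SBC Scanline        ;+3 12
        ADC #255-HEIGHT     ;+2 14  carry clear only if 0 <= A < HEIGHT
        LDA #00             ;+2 16
        ADC #00             ;+2 18  A = 0 and Z = 1 if carry was clear
        PHP                 ;+3 21  writes ENABL, S drops to ENAM1

        LDA Missile1Y       ;+3 24
        SEC                 ;+2 26  still needed: ADC #00 left carry clear
        SBC Scanline        ;+3 29
        ADC #255-HEIGHT     ;+2 31
        LDA #00             ;+2 33
        ADC #00             ;+2 35
        PHP                 ;+3 38  writes ENAM1, S drops to ENAM0

        LDA Missile0Y       ;+3 41
        SEC                 ;+2 43
        SBC Scanline        ;+3 46
        ADC #255-HEIGHT     ;+2 48
        LDA #00             ;+2 50
        ADC #00             ;+2 52
        PHP                 ;+3 55  writes ENAM0
```

Objects with different heights would just need a different constant in
each ADC.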

Comparing with an alternate approach, using branching:

        LDX #0          ;+2  2
        LDA MissileY    ;+3  5
        SEC             ;+2  7
        SBC Scanline    ;+3 10
        CMP #4          ;+2 12  A < 4 for the same 4-line height as above
        BCS L1          ;+2 14
        LDX #2          ;+2 16
L1      STX ENAM1       ;+3 19 (18 if branch taken)

This takes 18-19 for one object, 36-38 for two and 54-57 for three, so the
timing is pretty close to the same. Unless someone can optimize one
instruction out of either routine?

--
Archives available at http://www.biglist.com/lists/stella/archives/
E-mail UNSUBSCRIBE in the body to stella-request@xxxxxxxxxxx to be removed.