Re: [stella] I did it!(SFCaves 2600)

Subject: Re: [stella] I did it!(SFCaves 2600)
From: Erik Mooney <emooney@xxxxxxxxxxxxxxxx>
Date: Sun, 01 Oct 2000 21:50:22 -0400
>Side note: can someone clarify for me, is a BEQ that is not taken 3 or 4
>cycles?

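Not taken? 2 cycles.  If it's taken, it's 3 cycles... but that becomes 4
cycles if the branch crosses a page boundary -- that is, if the target is
on a different page (different high byte of the program counter) from the
instruction following the branch.  (Same for every other branch
instruction, including one you use as an always-taken "unconditional"
branch.  JMP absolute is always 3 cycles.)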
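
To make that rule concrete, here is a small Python model of the timing. It is only a sketch for counting cycles by hand: the function name and the example addresses are made up for illustration, and it assumes the page comparison is made against the address of the instruction after the 2-byte branch.

```python
def branch_cycles(branch_addr, target_addr, taken):
    """Cycle count for a 6502 relative branch (BEQ, BNE, BCC, ...).

    branch_addr is the address of the branch opcode itself.
    """
    if not taken:
        return 2
    # Branch instructions are 2 bytes, so the PC already points here
    # when the offset is applied.
    next_instr = (branch_addr + 2) & 0xFFFF
    crosses_page = (next_instr & 0xFF00) != (target_addr & 0xFF00)
    return 4 if crosses_page else 3


# Hypothetical addresses, chosen only to hit each case.
print(branch_cycles(0xF0F0, 0xF0E0, taken=False))  # not taken
print(branch_cycles(0xF0F0, 0xF0E0, taken=True))   # taken, same page
print(branch_cycles(0xF0FA, 0xF105, taken=True))   # taken, target on next page
```

So a BEQ that falls through (as the one at the bottom of your loop does on the last pass) costs 2, and the 3-or-4 question only comes up when it is taken.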

As for the carry flag, remember that only a very limited number of
instructions actually affect it - apart from SEC/CLC themselves, only
compare, add/subtract, and shift/rotate instructions change C (plus PLP
and RTI, which restore all the flags), and you don't do many of those.  So
after each operation that affects C, look at whether that operation always
leaves C in the same state... and if so, you can optimize out the
next SEC/CLC instruction.
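
If it helps to see what "always leaves C in the same state" means, the following Python sketch models binary-mode ADC and SBC on 8-bit values. The column and shift numbers are invented for illustration; I don't know what your leftcave/rightcave tables actually contain.

```python
def adc(a, operand, carry):
    """8-bit binary-mode ADC: returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def sbc(a, operand, carry):
    """8-bit binary-mode SBC: A - M - (1 - C), same as ADC of the inverted operand."""
    return adc(a, operand ^ 0xFF, carry)


# Hypothetical column 40, shift 3.
print(adc(40, 3, 0))     # sum fits in a byte: C comes out clear
print(sbc(40, 3, 1))     # no borrow: C comes out set
print(adc(250, 10, 0))   # sum > 255: C comes out set
print(adc(40, 0xFF, 0))  # adding -1 as two's complement: C comes out set
```

The first two cases are the ones that let you drop a following CLC or SEC.  The last one is the thing to watch for: if a shift table holds negative values stored as two's complement, the ADC will set C even though the column never goes anywhere near 255.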


>	LDX #40		;number of 'lines' where a line is 4 lines.
>			;  (ie. 4 line kernal)
>
>
>	LDA #00		;2
>
>;this section is for the downward or down then up cave type.
>loopcave1
>	STA WSYNC	;3
>
>
>	STA PF0		;3
>	STA PF1		;3
>	STA PF2		;3
>
>	LDA currleftcol	;3 ;load the column from last line.
>
>	CLC		;2
>
>	ADC leftcave,X	;45 ;add on the shift.

Remember this spot, and look farther down:

>	STA currleftcol	;3 ;save the result for next line.
>	TAY		;2 ;put result into Y for indexing.
>	LDA cavedata1a,Y	;45;load the first byte.
>	STA pfdata1	;3 ;store for playfield data 1
>	LDA cavedata1b,Y	;45;load the second byte.
>	STA pfdata2	;3 ;store for playfield data 2
>	LDA cavedata1c,Y	;45;load the third byte.
>	STA pfdata3	;3 ;store for playfield data 3
>;44
>	LDA cavedata1d,Y	;45;load the forth byte.
>	STA pfdata4	;3 ;store for playfield data 4
>	LDA cavedata1e,Y	;45 ;load the fifth byte.
>	STA pfdata5	;3 ;store for playfield data 5
>	LDA cavedata1f,Y	;45 ;load the sixth byte.
>	STA pfdata6	;3 ;store for playfield data 6
>;65
>	LDA currrightcol	;3 ;load the column from last line.
>
>	SEC		;2
>
>	SBC rightcave,X	;45 ;add on the shift.
>;74
>	TAY		;2 ;put result into Y for indexing.
>;76!!!
>;	STA WSYNC	;3
>
>	STA currrightcol	;3 ;save the result for next line.
>	LDA cavedata2a,Y	;45;load the first byte.
>	AND pfdata1	;3 ;and the halves togather, and load!
>	STA PF0		;3
>	LDA cavedata2b,Y	;45;load second byte.
>	AND pfdata2	;3 ;and the halves togather, and load!
>	STA PF1		;3
>	LDA cavedata2c,Y	;45;load third byte.
>	AND pfdata3	;3 ;and the halves togather, and load!
>	STA PF2		;3
>;33
>	LDA cavedata2d,Y	;45;load the first byte.
>	AND pfdata4	;3 ;and the halves togather, and load!
>	STA PF0		;3
>	LDA cavedata2e,Y	;45;load second byte.
>	AND pfdata5	;3 ;and the halves togather, and load!
>	STA PF1		;3
>	LDA cavedata2f,Y	;45;load third byte.
>	AND pfdata6	;3 ;and the halves togather, and load!
>	STA PF2		;3
>;63
>
>	TXA		;2
>	ADC height	;3
>	TAY		;2
>
>	LDA #00		;2
>
>	SEC	;2
>;74
>;	STA WSYNC	;3
>
>
>;74
>	STA PF0		;3
>;1
>	STA PF1		;3
>	STA PF2		;3
>
>
>	LDA currrightcolb	;3 ;load the column from last line.
>	SBC rightcave,Y	;45 ;add on the shift.
>	STA currrightcolb	;3 ;save the result for next line.
>
>	LDA currleftcolb	;3 ;load the column from last line.
>
>	CLC		;2
>
>	ADC leftcave,Y	;45 ;add on the shift.

This is the last instruction in the loop that affects the carry flag, if I
have this right.  I suspect that the result of this ADC will never
overflow (>255), so the carry will always be clear after this
operation.  Therefore, you can dispense with the CLC up above, before the
first ADC in the loop, as long as C is also clear the first time through
(one CLC before loopcave1, outside the loop, takes care of that).

You may well be able to optimize out some more of the SEC/CLC operations
similarly.

>	STA currleftcolb	;3 ;save the result for next line.
>	TAY		;2 ;put result into Y for indexing.
>	LDA cavedata2a,Y	;45;load the first byte.
>	STA pfdata1	;3 ;store for playfield data 1
>	LDA cavedata2b,Y	;45;load the second byte.
>	STA pfdata2	;3 ;store for playfield data 2
>	LDA cavedata2c,Y	;45;load the third byte.
>	STA pfdata3	;3 ;store for playfield data 3
>;52
>	LDA cavedata2d,Y	;45;load the forth byte.
>	STA pfdata4	;3 ;store for playfield data 4
>	LDA cavedata2e,Y	;45 ;load the fifth byte.
>	STA pfdata5	;3 ;store for playfield data 5
>	LDA cavedata2f,Y	;45 ;load the sixth byte.
>	STA pfdata6	;3 ;store for playfield data 6
>;73
>
>
>;	STA WSYNC	;3
>;73
>	LDY currrightcolb	;3 ;load column for this line(already computed)
>;76!!!!
>
>	LDA cavedata1a,Y	;45;load the first byte.
>	ORA pfdata1	;3 ;and the halves togather, and load!
>	STA PF0		;3
>	LDA cavedata1b,Y	;45;load second byte.
>	ORA pfdata2	;3 ;and the halves togather, and load!
>	STA PF1		;3
>	LDA cavedata1c,Y	;45;load third byte.
>	ORA pfdata3	;3 ;and the halves togather, and load!
>	STA PF2		;3
>;30
>	LDA cavedata1d,Y	;45;load the first byte.
>	ORA pfdata4	;3 ;and the halves togather, and load!
>	STA PF0		;3
>	LDA cavedata1e,Y	;45;load second byte.
>	ORA pfdata5	;3 ;and the halves togather, and load!
>	STA PF1		;3
>	LDA cavedata1f,Y	;45;load third byte.
>	ORA pfdata6	;3 ;and the halves togather, and load!
>	STA PF2		;3
>;60
>
>
>
>	LDA #00		;2
>	DEX		;2 ;move counter
>;64
>;	BNE loopcave1	;check if done, otherwise, loop.
>	BEQ noloopcave1	;34;check if done, otherwise, loop.
>	JMP loopcave1	;3
>noloopcave1
>
>	STA WSYNC	;3

(Code left in so somebody else can double-check me that none of those
instructions affects the carry flag.)

I'll also issue my standard advice for somebody trying to do way too much
with the playfield registers: Figure out if you can rearrange things to
use the reflected PF format, and leave PF0 out entirely, using only PF1
and PF2.  That narrows your usable screen by 20%, but it reduces the
number of load/combine/store sequences (LDA, then AND or ORA, then STA) by
one-third, which would buy you a lot of time in there.
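
For anyone wondering where the 20% and the one-third come from, here is the arithmetic as a short Python sketch.  It only assumes the standard register widths (PF0 shows 4 playfield bits, PF1 and PF2 show 8 each); the variable names are mine.

```python
from fractions import Fraction

# Playfield bits displayed by each register on one half of the screen.
PF_BITS = {"PF0": 4, "PF1": 8, "PF2": 8}

full_width = sum(PF_BITS.values())            # 20 bits per half line
without_pf0 = full_width - PF_BITS["PF0"]     # 16 bits per half line

width_lost = Fraction(full_width - without_pf0, full_width)
writes_saved = Fraction(1, len(PF_BITS))      # one of three registers dropped

print(width_lost, writes_saved)  # 1/5 1/3
```

In reflected mode (bit 0 of CTRLPF set) the right half mirrors the left, so a PF0 left at zero blanks the outermost 4 playfield pixels on each side, and each register write you drop also drops the load and the AND/ORA that fed it.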

--
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/
