Subject: Re: [stella] I did it!(SFCaves 2600)
From: Mark De Smet <de-smet@xxxxxxxxxxxxxxx>
Date: Mon, 2 Oct 2000 23:30:40 -0500 (CDT)

> I had a bit of a look at this code.

I appreciate the help :-)

> Why not include the LDA #0 inside the loop, at the start, instead of
> duplicating it at the end....

It doesn't really hurt, as it only adds a few bytes to the code space,
which I'm not concerned about. However, I had to do it in order to get my
second writes to PFx to occur before the registers are used. I am bit
mapping the playfield, so I have to update the PF registers mid line. If
I move that LDA as you suggest, then the write will be too late. I had
started out with it as you describe, but was forced to do this to make
the timing work.

> loopcave1
>     lda #0
>     sta WSYNC
> ... [snip/snip]
> > ;LDA #00 ;2 << get rid of this = 2 bytes saving

Well, doesn't save 2 bytes, simply moves them.

>     DEX
>     BNE loopcave1 << restore branch, if <128 bytes... branches
> are quicker!
> > BEQ noloopcave1 ;34;check if done, otherwise, loop.
> > JMP loopcave1 ;3

I wish... I had the BNE, but it exceeded the 128 bytes, and I don't know
how to make it shorter.

> from memory, JMP instructions are 4 cycles, not 3. They're nearly always
> slower than branches

Ya, but I didn't really have a choice. Bensema's guide to cycle counting
says 3.

> I also recall writes to memory locations being 3 cycles for zero page, 4
> cycles otherwise.

Yes, did I have any STA's marked as something besides 3? All writes are
zero page (that's where all the memory and registers are located).

> You don't need the 1st wsync in the loop, either, if you get your timing
> exactly right. another 6 cycles and 5 bytes saving per loop.

That is what I want to do.

> > TXA ;2
> > ADC height ;3
> > TAY ;2
>
> And here (above) you have forgotten to clear the carry. It's probably in a
> known state, given your earlier subtraction wouldn't overflow... but this is
> something you have to fix, or make SURE you know the state of the carry
> before you do the addition. You could also consider using a table for the
> above...

It is known, it is 1 because, as you point out, the previous subtraction
won't overflow. However, to save the time, I have adjusted for the extra
subtraction by adding 1 to the data.

>     ldy xplusheight,x ; 5 (or ldx blah,y ... i can't recall
> which is OK .. if either!)
>
> an alternative
>
>     lda xplusheight,x
>     tay
>
> Sometimes this is a better way to do it (that is, using a table to do
> constant additions for you)... if you needed to set the carry to do the
> first method, this is smaller in code size and cycles. Cost is the bytes
> required for your table. I'm unsure if your "height" is constant.

This is the sort of thing I'm looking for. I think I have more ROM space
than kernel time, so I'd like to make the trade. As you suggest, height
is a variable, however. It is limited in scope though, so I'll think
about it and see if I can replace it with a table anyway.

> > LDA #00 ;2
> >
> > SEC ;2
> > ;74
> > ; STA WSYNC ;3
> >
> >
> > ;74
> Avoid stringing out the SEC so far away from its needed use... unless you
> really need it there for cycle counting.
> It's too easy for it to get misplaced/lost/changed before it is actually
> needed.

Ya, I should probably clean that up ;-)

> The code seems to load one batch of data, mask out (AND) some bits, save it
> all, then load it all up again and OR in some bits, then write to the
> registers. It might be possible to improve the efficiency by using both
> index registers and doing the and/or in one hit. I haven't spent enough
> time with the code to look at the pros/cons... especially regarding the scan
> line timing. But here's the gist of it...

The AND comes in the first two-line kernel, and the OR in the second.
They manipulate different sets of data (well, same data, different
order, which makes all the difference).

>
>     phx
>
>     CLC
>     LDA currleftcol
>     ADC leftcave,X
>     STA currleftcol
>     TAY
>
>     SEC
>     LDA currrightcol ;3 ;load the column from last line.
>     SBC rightcave,X ;45 ;add on the shift.
>     tax
>
>     LDA cavedata1a,Y
>     and cavedata2a,x
>     STA PF0
>
> ... etc all the way through the bytes
>
>     plx
>
> This is probably another of those "nice code, but you can't do it that way
> (timing!)" things.
> But, something to think about :)

Good idea. I'm sure I did it the way I did because I did the first half
of the code, then added in the second half after the first worked. I
think this will work, but I'll have to see if it saves time. Of course
there is only one way to find out :^P I think the trick will be timing
the placement of the writes to the PFx to work out right.

Thanks for your help!
Mark

--
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/
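
For reference, the two ways of computing X + height discussed in this message can be put side by side. What follows is a minimal sketch in DASM syntax, not code from the SFCaves source. The names `CAVEHT`, `addrun` and `lookup`, the zero-page address chosen for `height`, the 192-entry table size, and the $F000/$F100 origins are all assumptions made for illustration. The table version only applies if the height can be treated as a constant, or limited to a few values with one table each.

```asm
        processor 6502

; All names and addresses below are hypothetical, for illustration only.
CAVEHT  = 40                    ; assumed constant cave height
height  = $80                   ; assumed zero-page variable holding the height

        org $F000

; (a) Add at run time. The carry must be in a known state before ADC.
addrun  TXA                     ; 2 cycles
        CLC                     ; 2 cycles (only skippable if carry is known)
        ADC height              ; 3 cycles, zero-page operand
        TAY                     ; 2 cycles, 9 in total
        RTS

; (b) Look the sum up. The carry flag plays no part.
lookup  LDY xplusheight,X       ; 4 cycles while no page boundary is crossed
        RTS

        org $F100               ; page-aligned, so xplusheight,X never crosses a page
xplusheight
VAL     SET 0
        REPEAT 192              ; one entry per assumed X value, 0..191
        .byte VAL + CAVEHT      ; largest entry is 191 + 40 = 231, fits in a byte
VAL     SET VAL + 1
        REPEND
```

Under these assumptions the 9-cycle sequence in (a) becomes a single 4-cycle load, or 6 cycles when written as `lda xplusheight,x` followed by `tay`, at a cost of 192 bytes of ROM for the table.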