Re: Re[2]: [stella] I did it!(SFCaves 2600)

Subject: Re: Re[2]: [stella] I did it!(SFCaves 2600)
From: Mark De Smet <de-smet@xxxxxxxxxxxxxxx>
Date: Fri, 13 Oct 2000 23:17:19 -0500 (CDT)
After a lot of thought and brainstorming with a friend, I've come back for
more discussion.

> >> - If you have the chance to determine (from the contents of X and Y),
> >> if the left/right half of the line is full/empty you could gain some
> >> extra time (i did this in Thrust)
> 
> MDS> Yes, there will be many lines that are full, but once I get this free
> MDS> time, what do I do with it?  The player objects are never drawn on lines
> MDS> that are full.  There will be few lines that are all empty, and the player
> MDS> objects will be drawn on those, but also partially filled lines.
> 
> Ok, i will try to explain more detailed what i mean:
> 
> In your current scheme, a half playfield line can only
> look like this (this is exactly as in my Thrust-kernel,
> except for number 4, which is not possible):
> 
> 1 |-------     |
> 2 |     -------|
> 3 |---     ----| bottom only
> 4 |    ----    | top only
> 5 |            |
> 6 |------------|
> 
> Each of these halves can be only combined in a very
> limited way. The resulting matrix shows this:
> 
>   | 1 2 3 4 5 6
> --+------------
> 1 | - x - - x x
> 2 | x - - - x x
> 3 | - - - - - x
> 4 | - - - - x -
> 5 | x x - x x x
> 6 | x x x - x x

I don't believe these are the only half line types if I am putting the
cave top and bottom on the same line.  Nor do I agree that this matrix is
correct.  I would agree if I was still doing the cave top on one pair of
lines, and the cave bottom on the next.  However, I think the point we are
working on is to merge the lines so the top and bottom of the cave are
drawn on the same line.  The purpose of this again was so that I could
gain a line free for drawing objects.  More below.

> If you look at it, you will see, that it is never
> necessary to do the AND/OR on both halves. You can
> reduce the problem to the following three cases.
> a. The line goes on and off in different halves (1,2).
>    -> you don't need any ANDs/ORs, just store the data
>    of the correct tables for both halves.
> b. The line goes on and off in the same half (3,4).
>    -> AND/OR one half, the other half is simply $FF/$00
> c. The state of the line never changes.
>    -> this is simply a special case of a. or b.

I think I am following you.  After working on this for a while, I think I
can carry your ideas to what I'm seeing.  However what I'm thinking is
much more complex, but I think it'll work.

Here are the steps I may need to do:
(pseudo code)  I am now using "left" to refer to the left three bytes that
will go into the PF registers and "right" to the right three bytes.  What I
called left and right before, I am now calling part 1 and part 2.
These two parts result from the fact that I am using two shifts per line
to get the complex cave types.  Part 1 applies in the area to the left of
the apex, and part 2 applies in the area to the right of the apex.

;pseudo code                     ;step #   ;estimated cycles
;for the left half:
load the cave top data part 1    ;1        ;4
load the cave top data part 2    ;2        ;4
OR  together top data            ;3        ;(part of 2 if 1&2 done)
save to temp if required         ;3a       ;3
load the cave bottom data part 1 ;4        ;4
load the cave bottom data part 2 ;5        ;4
AND together bottom data         ;6        ;(part of 5 if 5&6 done)
OR together top and bottom data  ;7        ;4(part of 4 if only 1&4 done)
store to pf or temps                       ;3
;for the right half:
load the cave top data part 1    ;8        ;4
load the cave top data part 2    ;9        ;4
OR  together top data            ;10       ;(part of 9 if 8&9 done)
save to temp if required         ;10a      ;3
load the cave bottom data part 1 ;11       ;4
load the cave bottom data part 2 ;12       ;4
AND together bottom data         ;13       ;(part of 12 if 11&12 done)
OR together top and bottom data  ;14       ;4(part of 12 if only 9&12 done)
store to pf or temps                       ;3
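
To make the AND/OR order above concrete, here is a small Python sketch of
what one playfield byte goes through.  This is only a model of the pseudo
code, not kernel code: the function names and the sample bit patterns are
made up for illustration, and I am assuming a set bit means "draw cave
wall here".

```python
def combine_full(top1, top2, bot1, bot2):
    """Model of steps 1-7 (or 8-14) for one playfield byte.

    Assumption: each argument is one 8-bit table entry, and a set bit
    means a wall pixel is drawn.
    """
    top = (top1 | top2) & 0xFF        # steps 1-3: OR the two top parts
    bottom = (bot1 & bot2) & 0xFF     # steps 4-6: AND the two bottom parts
    return top | bottom               # step 7: OR top and bottom together


def combine_short(top1, bot1):
    """Model of the shortcut where only steps 1, 4 and 7 are done."""
    return (top1 | bot1) & 0xFF


# Hypothetical sample data, chosen only to show the bit operations.
full = combine_full(0b11100000, 0b00000000, 0b00011111, 0b00001111)
short = combine_short(0b11100000, 0b00001111)
print(f"{full:08b} {short:08b}")
```

In the shortcut the part 2 loads drop out entirely, which is where the
time saving in the shorter step lists comes from.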

Now here are the cave types that can be drawn in the left half (assuming
the cave goes up, then down, or is a degenerate type of this).  As pointed
out below, there are even more; I'll address this later.

1 |------    |
2 |    ------|
4 |   ---    |
5 |          |
6 |----------|
7 |---   ----|
8 |---   ----|
9 |--  --  --|

Sorry, there is no 3, I removed it early on, and I'll get confused if I
renumber them.

To explain these, and to point out how there are no duplicates, here are
some examples.  Remember that these are only the left sides of the screen.
Note the difference between 7 and 8 in the examples.

example 1
--------------------    <-type 6
------------------      <-type 1
----------------        <-type 1
--------------    --    <-type 8
------------    ----    <-type 8
----------    ------    <-type 8
-------     --------    <-type 8
----    ------------    <-type 8
-     --------------    <-type 8
    ----------------    <-type 2
  ------------------    <-type 2

example 2
--------------------    <-type 6
--------------------    <-type 6
---------   --------    <-type 7
------        ------    <-type 7
---      --      ---    <-type 9
      --------          <-type 4
   --------------       <-type 4
--------------------    <-type 6
--------------------    <-type 6
--------------------    <-type 6

example 3
--------------------    <-type 6
--------------------    <-type 6
---------   --------    <-type 7
------        ------    <-type 7
---              ---    <-type 7
                        <-type 5
         --             <-type 4
      --------          <-type 4
   --------------       <-type 4
--------------------    <-type 6

example 4
--------------------    <-type 6
--------------------    <-type 6
-------------   ----    <-type 7
----------        --    <-type 7
-------                 <-type 1
----                    <-type 1
-            --         <-type 9(mostly)
          --------      <-type 4
       -------------    <-type 2
    ----------------    <-type 2
 -------------------    <-type 2

example 5
--------------------    <-type 6
--------------------    <-type 6
-----   ------------    <-type 7
--        ----------    <-type 7
             -------    <-type ?
                ----    <-type ?
     --            -    <-type 9(mostly)
  --------              <-type 4
-------------           <-type ?
----------------        <-type ?

The question marks indicate the cases that I didn't have listed.

Here is my proposition:

To handle example 1, I would need to do these steps from the pseudo code
above:  
1, 4, 7, 8, 9, 10, 10a, 11, 12, 13, 14

To handle example 2, 3, 4 or 5, I would need to do these steps:
1, 2, 3, 3a, 4, 5, 6, 7, 9, 12, 14

So currently I'm thinking that I need two kernels for the up, down case:
one for when the peak is on the left half, and one for when the peak _may_
be on the right.  (If there is no peak, I am assuming there is an imaginary
one at the extreme right.)  I would be able to remove 2 of the loads (which
are time consuming) and 2 of the AND/ORs.  (I realise that they would be a
combined statement.)

Do you follow all this?  Do you agree that it will work?  Are there any
cases or problems that I have missed?  Is there any more simplification?

I would then need to double the kernels for the opposite cave shape (down
then up).  The opposite case would be very similar, the only difference is
in my add/subtract and where the AND/OR are used.

> Normally you would need at least 66(=6*11) cycles to do
> the calculations for both halves. In case a. this can
> be reduced down to 42(=6*7) cycles, case b. needs 44
> (=3*11+11) cycles. If you add the overhead for the
> determination of the case, you will *never* need more
> than about 55 cycles/line.

Using the above, I think it would take 111 cycles.  I'm not sure where
you are getting your numbers.  I took the estimated cycles for the steps
used, simplifying where possible by doing one of the ANDs or ORs at the
same time as the indexed load.  A quick look at my original code
shows that I need about 50 cycles for the overhead (stuff like keeping a
line counter, looping, keeping track of the shifts etc. to make the scheme
work).  The total appears to be 161, which will in fact fit in less than 3
lines (3 x 76 = 228 cycles), leaving the last line for my player objects
and stores to PF if needed.

Alternatively, if I do all of the steps, I think it would take 156 cycles,
making for a total of about 206, which might still work anyway.
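
For anyone who wants to check the arithmetic, here is a Python sketch of
how I arrive at those figures.  It assumes the per-step estimates from the
pseudo code, one 3-cycle store per half, and that the whole step list is
repeated for each of the three bytes in a half (that last point is my
reading of the numbers, since it is what makes them come out to 111 and
156).  The names are made up for this sketch.

```python
CYCLES_PER_SCANLINE = 76   # CPU cycles per scanline on the 2600
BYTES_PER_HALF = 3         # assumption: steps are repeated per PF byte
OVERHEAD = 50              # rough estimate from my original code
STORE = 3                  # "store to pf or temps", once per half

# Estimated cycles per step; folded AND/OR steps cost 0.
STEP_CYCLES = {
    "1": 4, "2": 4, "3": 0, "3a": 3, "4": 4, "5": 4, "6": 0, "7": 4,
    "8": 4, "9": 4, "10": 0, "10a": 3, "11": 4, "12": 4, "13": 0, "14": 4,
}

def kernel_cycles(steps):
    """Estimated cycles for one pass over all playfield bytes."""
    total = sum(STEP_CYCLES[s] for s in steps) + 2 * STORE
    # Step 7 folds into load 4 when the left part 2 loads are skipped.
    if "7" in steps and "2" not in steps and "5" not in steps:
        total -= STEP_CYCLES["7"]
    # Step 14 folds into load 12 when only 9 and 12 are loaded.
    if "14" in steps and "8" not in steps and "11" not in steps:
        total -= STEP_CYCLES["14"]
    return total * BYTES_PER_HALF

PEAK_RIGHT = ["1", "4", "7", "8", "9", "10", "10a", "11", "12", "13", "14"]
PEAK_LEFT = ["1", "2", "3", "3a", "4", "5", "6", "7", "9", "12", "14"]
ALL_STEPS = list(STEP_CYCLES)

for name, steps in [("peak right", PEAK_RIGHT), ("peak left", PEAK_LEFT),
                    ("all steps", ALL_STEPS)]:
    work = kernel_cycles(steps)
    print(name, work, work + OVERHEAD, "of", 3 * CYCLES_PER_SCANLINE)
```

Both shortened step lists come out the same, and even the all-steps case
stays under the 228 cycles available in three scanlines.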

Do you agree with my above cycle counts?  Is there something that I
missed?

> This will make your kernel far more complicated, but
> gives you a total of 22 free cycles each four lines.
> And you are always short of time, when designing a nice
> kernel for the 2600.

Yes, _much_ more complicated.  I would appreciate help in determining if
this is doable before I actually get to writing it, as I can tell it will
be at least a full day's work to implement this.

> I hope this may help you (and others?).

Everyone has been a great help!  I wouldn't have gotten this far without
it!

Still trying to do too much with a 2600,
Mark


--
Archives (includes files) at http://www.biglist.com/lists/stella/archives/
Unsub & more at http://www.biglist.com/lists/stella/
