[stella] BAStella Language Reference V0.000000001

Subject: [stella] BAStella Language Reference V0.000000001
From: "Roger Williams" <mer02@xxxxxxxxxxxxx>
Date: Tue, 20 Nov 2001 19:42:20 -0800
Over the past few days I've been giving some thought to the old
question of whether a higher-level language than Assembler is
possible for Stella, and I think I've thought of a way to do it.
This document is a "thought experiment" I wrote in an effort to
see if the concept makes sense.
Among the things I'd like to hear from my comrades here...
1. Is this a good idea, or would it just ruin the mystique?
2. Is it workable as presented?
3. Is anybody interested in using it?
4. What features or functions have I forgotten?  (Note: I've
   deliberately left out quite a few for resource reasons.)
Please let me know what you think.
--Roger Williams
BAStella Language Reference V0.000000001
by Roger Williams
November, 2001
At this point this is a thought experiment in what a compiled
language for Stella might look like. I am however seriously thinking
of implementing this language,  as it's the first time I've had such
an impulse where I don't have to worry about the target system
becoming obsolete :-)  This may be regarded as a request for
comments, suggestions, and criticism from the Stella programming
community.  Read "IS" as "WILL BE" throughout the document.
I am absolutely confident that I can implement what is described
here (until someone points out an error) but given my current
(over-)work situation it might take me a year or so to do it if
I sign on.  Thus my keen interest to find out if there is an
actual interest before I embark on the project.
If I build this thing I will not put it in the public domain, but
will do something like a GPL so that it can be extended and
distributed at no charge with proper credit to all parties concerned.
BAStella is an integer BASIC cross-compiler for the Atari 2600 VCS.
It is written in QuickBASIC, but with little use of QB specific
features so it should compile with little or no modification on any
similarly structured BASIC variant.
While BASIC may not be the most appropriate language for compiler
design 8-P I'm very familiar with it, and it seems appropriate as
it was the personal computer lingua franca of Stella's generation.
While no compiler can perform the optimizations which make advanced
Stella games possible, BAStella offers the beginning programmer
an environment which simplifies the construction of simple games like
Pong, Combat, and Space War.  BAStella allows a surprising degree of
creativity for its ease of use, and I anticipate that it will inspire
an array of hacks comparable in spirit, if not in magnitude, to those
made by Stella machine language programmers to extend their working
I have also tried to preserve Stella's spirit in a number of ways so
that the advanced BAStella programmer will have a running start if
he wishes to progress to 6502 assembly language.

BAStella Environment
BAStella virtualizes the kernal.  To a BAStella programmer, 2600
sprites really ARE like Commodore 64 sprites -- just set the
table source, X and Y locations, and height, and it appears on
the screen.
BAStella code runs asynchronously with the kernal.  This is
accomplished by running it during VBLANK, but peppering the object
code with timer checks so that the kernal can be invoked when
necessary.  If the check is done when the accumulator does not
hold valuable data (a condition which occurs frequently in compiler
object code) it should take about 8 cycles to clear the path for
the next ~200.  
It should be fairly simple to make sure that
checks are scattered appropriately, with at least one in every
loop and none separated by too much code.
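As a rough illustration of the "no two checks separated by too much code" rule, here is a minimal Python sketch of a pass over straight-line object code. The function name, the data layout, and the use of 200 cycles as a hard budget are assumptions made for the example; the "one check in every loop" rule is not modeled.

```python
# Hypothetical sketch; names and numbers are illustrative, not BAStella's.
CHECK_COST = 8   # approximate cycles for one timer check (figure from the text)
BUDGET = 200     # approximate cycles that one passing check clears

def insert_checks(costs, budget=BUDGET, check_cost=CHECK_COST):
    """Given the cycle cost of each instruction in straight-line code,
    return a list of ("op", cost) and ("check", cost) items in which the
    cycles executed between consecutive checks never exceed `budget`.
    Assumes the code is entered immediately after a check."""
    plan = []
    since_check = 0
    for cost in costs:
        if cost > budget:
            raise ValueError("a single instruction exceeds the budget")
        if since_check + cost > budget:
            # Place a check before the instruction that would overrun.
            plan.append(("check", check_cost))
            since_check = 0
        plan.append(("op", cost))
        since_check += cost
    return plan
```

For five 90-cycle chunks this places a check before the third and the fifth chunk, so no run between checks is longer than 180 cycles.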
The default BAStella kernal draws all five screen objects over
a non-asymmetrical playfield, but other kernal options are
available including scores, and the kernal types can be mixed
on a single screen.
Because BAStella code is asynchronous, the wise BAStella
programmer checks the system FRAME counter (incremented with
each kernal call)  to control the speed of play.  BAStella is
more forgiving than machine language of the occasional overlong
calculation, since FRAME can be allowed to slip on occasion.
BAStella code can catch up with less labor-intensive frames,
while the machine code which lets VBLANK slip will cause the
screen to jump.
BAStella takes an input file which must have a .bas extension,
and creates two identically named files with extensions .bin
and .lst.
The .bin file is ready for use by an emulator or to burn into
a 4K cartridge.
The .lst file details the RAM map and the addresses of all
statements and tables, and any errors found during compilation.
BAStella doesn't create a re-assemblable .asm file, since
it would greatly complicate BAStella and DiStella can do that.
(I may change my mind on this, as it depends on some nuts and
bolts I haven't fully worked out.)
If BAStella finds errors in the source, it creates a .BIN
which displays the error message on a 2600.  This is useful
for the developer who is scripting back and forth between an
editor, BAStella, and an emulator.
This version of BAStella will be limited to non-bankswitched
4K carts and Stella's native 128 byte RAM.  (I see the greatest
audience being those who want to plug a cart they've made
themselves into a real Stella, but who aren't up to cycle
counting in assembly.)  I have some ideas for supporting some
bankswitch and outboard RAM schemes but these will come later
when the basic concept is proven.

BAStella Language
BAStella is a semi-structured BASIC variant much like the one
it's written in.  Line labels are named, not numbered, and
only one instruction is permitted per line (BAStella's one main
Visual Basic "feature").  BAStella doesn't have a lot of
specialized I/O instructions since most control is effected by
reading or assigning to the reserved system variables, some of
which are the familiar PIA and TIA registers from VCS.H and some
of which control the virtualized kernal(s).
 LET var = exp
 (or just var = exp)
 PLUCK var
This is provided for registers that don't need a particular value,
like WSYNC, to write whatever happens to be lying around to the
indicated address.  This will be most useful with CXCLR to clear
the collision registers.

LABEL: GOTO LABEL 'comments after apostrophe
(Note that, unlike DASM, BAStella labels must end with a colon:
as well as beginning at column 1, since other code can begin at
column 1.  Like most BASICs, BAStella also isn't case-sensitive.)

 REM comment for nostalgia's sake
 IF exp THEN LABEL '1-line branch if exp true
 IF exp THEN
Note there is no IF exp THEN statement : statement : statement

 CASE exp
 CASE exp
 CASE exp
SELECT CASE is a bit more efficient than successive if...thens,
especially if the CASEs are restricted to constants, a
condition BAStella will detect and optimize automatically.
[note: this may become a requirement, I haven't decided yet.]

 ON exp GOTO label0, label1, label2, label3...
(ON...GOTO is an oldie but goodie which can be implemented even
more efficiently than SELECT CASE, since the CASEs are both fixed
and consecutive so DEC can be used to successively check them.)
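One way the decrement-and-test idea could be turned into object code is sketched below in Python. This is a hypothetical code generator, not BAStella's. It assumes the selector is already in the accumulator, that label0 corresponds to the value 0, and that the X register is free. It uses DEX because the NMOS 6502 can only decrement memory or an index register, not the accumulator.

```python
def emit_on_goto(labels):
    """Emit 6502 assembly lines for ON exp GOTO label0, label1, ...
    Hypothetical sketch: assumes the selector value is in A and X is free.
    Out-of-range selectors simply fall through to the next statement."""
    if not labels:
        raise ValueError("ON...GOTO needs at least one label")
    lines = ["    TAX            ; copy selector to X, sets Z if it is 0"]
    lines.append(f"    BEQ {labels[0]}")
    for label in labels[1:]:
        lines.append("    DEX            ; next consecutive case")
        lines.append(f"    BEQ {label}")
    return lines
```

BEQ is a relative branch with a reach of roughly 127 bytes either way, so a real compiler would need to fall back to a JMP for distant labels.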

 FOR var = start TO finish (STEP step)
 WHILE exp
Block IF, FOR, and WHILE loops can be nested.
 GOSUB label
label: (code)
 CALL label
 SUB label
BAStella subroutines and functions can be nested, but if you
do this you will need to increase the stack size with a STACK
directive.  BAStella does not support parameter passing on
the stack because it's expensive both in RAM and CPU cycles.
This means CALL/SUB/END SUB is just a prettier GOSUB/RETURN.
(User-defined functions are OK, since they return their
results in the current math accumulator.)  You can also legally
CALL or GOSUB a function, since the return value will just be
ignored.  Functions cannot have arguments because of no math

Most Stella setup and I/O functions are performed by assignments
to system variables, but two big exceptions set up the kernal and
     Type, argument [,argument...]
     Type, argument [,argument...]
MAKE SCREEN may appear anywhere in the code, but like DIM and
TABLE its output will be placed in a special memory area
(with the other library and support code).  A MAKE SCREEN block
causes a kernal to be compiled out of calls to component kernal
Type elements.  Some of these elements require one or more
arguments to set them up, such as height in lines or height of a
Playfield or Score "big pixel."  All Type arguments must be
constants -- they cannot be calculated on the fly.
Types are not numeric but are named by keywords, so as to make
the code somewhat readable.  One key improvement to later
versions of BAStella will be addition of kernal component types.
SCREEN kernal Types in this release are:
 ALL5   = All 5 screen objects + split playfield
     ALL5, height [,playfield-bigpixheight]
 ASYM   = 2 sprites + asymmetric playfield
     ASYM, height [,playfield-bigpixheight]
 SCORE2 = 2 2-digit scores
     SCORE2 [,bigpixheight [,bigpixnum]]
 SCORE4 = 2 4-digit scores
     SCORE4 [,bigpixheight [,bigpixnum]]
 SCORE6 = 1 6-digit score
     SCORE6 [,height]
SCORE2 and SCORE4 are always defined in "big pixels" whose height
may in turn be specified.  Default is 5 big pixels each 8 screen
lines high.  SCORE6 pixels are always 2 screen lines high but may
be of any variable height, default 8 2-line pixels.
While SCREEN type arguments are given in elements of 1/192 of the
screen height, for consistency with single-pixel positioning
commands, the 2-line kernals only have 2-line accuracy when
interpreting them, so nonzero bit zeroes will generally be ignored.
If any type's HEIGHT element is omitted, it will extend to the
bottom of the screen.  If the total of all height elements specified
is less than 192, the last type specified will extend all the way to
the bottom.
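The height rules above can be sketched as a small layout routine. This Python version is only an illustration under stated assumptions: it treats every component as a 2-line kernal (so bit 0 of a height is dropped), it clips anything that would run past line 192, and the function and tuple layout are invented for the example.

```python
SCREEN_LINES = 192

def layout_screen(components, total=SCREEN_LINES):
    """components: list of (type_name, height_or_None) in source order.
    Returns a list of (type_name, first_line, lines_used).
    Hypothetical sketch of the rules in the text, not BAStella's code."""
    placed = []
    line = 0
    for index, (name, height) in enumerate(components):
        remaining = total - line
        if remaining <= 0:
            break                      # nothing left for later components
        is_last = index == len(components) - 1
        if height is None or is_last:
            used = remaining           # omitted height or last type: fill to bottom
        else:
            used = min(height & ~1, remaining)   # 2-line accuracy, then clip
        placed.append((name, line, used))
        line += used
    return placed
```

With ASYM given a height of 41 followed by ALL5 given 100, ASYM occupies lines 0 to 39 and ALL5 is stretched over the remaining 152 lines.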

 SETPF pftable
SETPF fully sets up the playfield from a table created by
the TABLE pftable AS PLAYFIELD directive, described below.  More
direct methods are supported but SETPF allows the programmer
to draw the playfield with the text editor, without worrying
about bit manipulations.

BAStella Math
BAStella supports 8- and 16-bit signed and unsigned math.
BAStella sports a fully functional expression evaluator which
understands precedence of operation and parentheses, and even
performs logical operations in the same way as more advanced
languages (e.g. operators like > and = resolve to zero or $FF).
 ()  Parentheses
 *   Multiplication
 /   Division
 \   Remainder after division
 +   Addition
 -   Subtraction
 +B+ BCD addition with Carry (see BCDCARRY function)
 -B- BCD subtraction with Borrow
 =   Equals
 <>  Not equal to
 !=  Same as <> for C-heads :-)
 >   Greater than
 <   Less than
 >= or =>  Greater than or equal
 <= or =<  Less than or equal
 AND Logical AND
 OR  Logical OR
 EOR Logical Exclusive OR, as in 6502
 XOR Same as EOR, more usual in BASIC
 ABS(exp) Absolute Value
 RANDOM  Random Number (May be reseeded by using
   it on the left side of an assignment)
 NOT(exp) Logical Negation
 ROR(exp) 6502 shift operations (but supported
 ASL(exp) to 16-bit precision if called for)
 ROCARRY  0 or 1 Carry for ROL and ROR
   (May be assigned or read)
 MSB(exp) Most Significant Byte of 16-bit as 8-bit
 LSB(exp) Least Significant Byte of 16-bit as 8-bit
 (MSB and LSB can also be used on the left side of an
 assignment statement.)
 BCDCARRY 0 or 1 Carry for BCD math
   (May be assigned or read)
 (How else to accumulate a 6-digit score?)
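To make the role of the BCD carry concrete, here is a minimal Python model of packed-BCD addition chained across three bytes, which is how a 6-digit score could be accumulated. The function names and the least-significant-byte-first layout are assumptions for the example, and the inputs are assumed to be valid BCD.

```python
def bcd_add(a, b, carry_in=0):
    """Add two packed-BCD bytes (two decimal digits each) plus a carry.
    Returns (result_byte, carry_out). Inputs must be valid BCD."""
    lo = (a & 0x0F) + (b & 0x0F) + carry_in
    hi = (a >> 4) + (b >> 4)
    if lo > 9:
        lo -= 10
        hi += 1
    carry_out = 0
    if hi > 9:
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out

def add_to_score(score, points):
    """score, points: three packed-BCD bytes each, least significant first.
    Returns (new_score, final_carry)."""
    carry = 0
    result = []
    for s, p in zip(score, points):
        byte, carry = bcd_add(s, p, carry)
        result.append(byte)
    return result, carry
```

Adding 75 points to a score of 009950 in this model yields 010025, with the carry rippling through the middle byte.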

BAStella RAM management
A crucial bottleneck for any Stella project is the tiny 128-byte
RAM.  While BAStella is an incredibly primitive un-optimized
compiler (because it's the first compiler I've ever written) it
does manage RAM very carefully.
The BAStella RAM memory map looks like this:
 $FF Stack (typically 10+ bytes)
 SYSTOP System vars called for
  by functions used
 VARTOP DIMmed variables
 VECTOP Addressable Vector
  NAMEd variables
 $80 ...
SYSTOP is generally $F6 unless it's been overridden by a
STACK directive (which specifies the number of EXTRA bytes,
above and beyond what BAStella wants internally).  SYSTOP,
VARTOP, and VECTOP are determined during the first
compilation pass.  These three pointers are available as
read-only system vars in BAStella, referenced from memory
location $80=zero. 
I normally don't like requiring variables to be DIMensioned,
but it's kind of a duh thing here.
The Vector is a one-dimensional unsigned byte array which
grows upward from $80.  You can place named variables within
the vector (so as to control their positions, for example)
with NAME.  You can NAME a location more than once, and you
can NAME 16-bit locations.  This can be used, for example,
instead of LSB and MSB to split up 16-bit values.  Or you
can switch between signed and unsigned access by using a
different NAME.  The raw vector itself is named ! and is
of type BYTE:
 LET A = !(30)
 !(29) = B
You can also use NAME to force an out-of-memory error, since
BAStella won't NAME a location higher than VECTOP.  Otherwise
BAStella has no way of knowing that you have plans to index
vector entries up to a certain value. (On the other hand, at
runtime, BAStella will, like a certain other very popular
language I can think of, cheerfully overrun RAM if you address
out-of-range vector and table locations.)
In the LST file BAStella details the reasons for RAM allocation,
including the individual system vars and the functions
responsible for their inclusion, so that you can have some idea
what needs to be trimmed if you run out.
* Wonk question: Where's the math stack?
BAStella's math stack lives in the compiler.  At runtime
BAStella only needs an accumulator and operand register,
total 4 bytes if you use 16-bit math.  BAStella also can't
support 2-argument functions (e.g. MIN, MAX, etc.) because
of this.
* Wonk question #2: How do you support FOR...NEXT?
Nearly all languages implement FOR...NEXT (or its moral equivalent)
with a stack for nesting.  BAStella supports nested loops but it
doesn't use a stack.  Instead, during the first pass BAStella
identifies all DIMmed variables used as loop counters.  Loop
counters get additional storage allocated for the TO target.  If
STEP is used, they get another byte or word allocated for that too.
This means that in BAStella, you don't have EXIT FOR.  You can
legally branch right out of a FOR loop.  But on the other hand
it's in your interest to reuse counter variables, especially if
they're 16-bit or have a STEP.
BAStella lets you use NAMEd variables as FOR counters, but you're
on your own as far as making sure the additional storage is there
for the endpoint and step.
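The RAM cost of a loop counter under this scheme can be tallied with a few lines of Python. This is only a sketch: it assumes the TO target and the STEP value each take the same size as the counter itself, which the text suggests ("another byte or word") but does not spell out, and the names are invented.

```python
TYPE_SIZE = {"BYTE": 1, "SBYTE": 1, "WORD": 2, "SWORD": 2}

def loop_counter_bytes(var_type, has_step):
    """Bytes of RAM used by one FOR counter: the counter itself, its TO
    target, and (if STEP is used) the step value. Hypothetical sketch that
    assumes target and step are the same size as the counter."""
    size = TYPE_SIZE[var_type]
    slots = 3 if has_step else 2
    return size * slots
```

A WORD counter with a STEP costs 6 of the 128 bytes in this model, which is why reusing counter variables pays off.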

BAStella Variables & Constants
 DIM var (,var2, var3...) AS type
 NAME constant = var AS type
Var is the name by which you will later refer to the specified
storage location.  If the NAMEd constant is greater than $FF,
it is treated like an absolute memory address.  If it is less
than $FF then $80 is added to it so that it addresses the range
"zero" or RAM $80 to VECTOP.  All useful RAM addresses below
$80 are accessible as system vars with the same names they have
in VCS.H.
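The address rule for NAME can be written out as a tiny Python function. This is an illustrative sketch only. The text does not say what happens when the constant is exactly $FF, so lumping that value in with the vector offsets is an assumption of the example, as is the function name.

```python
RAM_BASE = 0x80   # "zero" of the vector, per the text

def resolve_name(constant):
    """Map a NAMEd constant to a memory address.
    Constants above $FF are taken as absolute addresses; smaller ones are
    offsets into the vector. Treating exactly $FF as an offset is an
    assumption, since the text leaves that case open."""
    if constant > 0xFF:
        return constant
    return RAM_BASE + constant
```

So NAME 0 refers to address $80 and NAME $1D to address $9D, while a constant such as $280 is used unchanged.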
The types are:
 BYTE unsigned
 SBYTE signed byte
 WORD unsigned
 SWORD signed word
All vars can be treated as arrays by adding an index to them.
Arrays and tables of word data are accessed by word index. 
You can create a signed or word "vector" easily enough:
 NAME 0 = WordVec AS WORD
The topmost legal entry will be INT(VECTOP/2).
While vars and tables (see below) are defined differently, they
are treated the same by the compiler.  This can cause great
mischief if you aren't careful.  Arrays and tables grow upward
in both the vector and memory.
DIMmed vars are assigned in the order found as the source file
is scanned, working down from VARTOP.  If a var is used as a
FOR index the TO target is located above it, and STEP above that.
BAStella allows you to use NAME to name FOR indices, but you must
be sure that the appropriate storage locations are free above the
index or you will get very surprising results.
 TABLE var AS type
   'comments are allowed here, but no code
   const, const, const...
   const, const, const...
As in DASM, in tables and everywhere else constants can be
decimal, $Hex, or %binary.  BAStella also supports :graphic
in which a binary value is developed with space or 0 as bit
0 and any other character as bit 1.  The PLAYFIELD table type
is an extension of this.  :graphic data must be the last
thing on the line, as they cannot be delimited.  Usually
they will be written one per line anyway, as that's the point.
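A :graphic constant could be decoded along these lines. The Python below is a sketch under assumptions the text does not state: values are 8 bits wide, the leftmost character is the most significant bit, and short or long input is padded or truncated the way PLAYFIELD lines are.

```python
def parse_graphic(text, width=8):
    """Decode the characters that follow the leading colon of a :graphic
    constant. Space or '0' gives bit 0, any other character gives bit 1.
    Hypothetical sketch: 8-bit width and MSB-first order are assumptions."""
    cells = text[:width].ljust(width)   # truncate long input, pad short input
    value = 0
    for ch in cells:
        value = (value << 1) | (0 if ch in " 0" else 1)
    return value
```

For example, the eight characters "X  XX  X" decode to $99 in this model.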
The constants in a table can also be labels referring to
memory addresses or vector entries.  (Vector entries will
be adjusted to memory addresses when written.)  This can be
useful for indexing score character sets.
TABLE definitions (like DIMs) may be located anywhere in the
source, but they will be arranged from the top of ROM down in
the order found during the first-pass source scan.
TABLE supports a useful extra type:
    'again, comments allowed
    :   X   X   X
    :  X X X X X X
    :   X   X   X
    'Colon or % can begin a line,
    'zero or space is blank
This statement actually creates 3 tables (var0, var1, var2)
for the playfield, reversing PF1 and shifting PF0 as necessary.
Each valid line consists of a colon or percent sign followed by
space/zero and other chars until the end of the line.  If the
line is too short it is padded with spaces; if too long, it is
truncated.  New entries are added to each of the three PF
tables until END TABLE is encountered.
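The split of one 20-character line into three register values might look like the following Python sketch. It relies on the TIA's drawing order as commonly documented: PF0 shows bits 4 through 7 left to right, PF1 shows bit 7 down to bit 0, and PF2 shows bit 0 up to bit 7. The function name and the exact padding behaviour are assumptions for the example.

```python
def playfield_row(text):
    """Convert the characters after the leading ':' or '%' of one playfield
    line into (pf0, pf1, pf2) byte values for the left half of the screen.
    Hypothetical sketch based on the TIA's playfield bit ordering."""
    padded = text[:20].ljust(20)        # too long: truncate; too short: pad
    cells = [0 if ch in " 0" else 1 for ch in padded]
    pf0 = 0
    for i, bit in enumerate(cells[0:4]):
        pf0 |= bit << (4 + i)           # PF0: bit 4 is drawn first
    pf1 = 0
    for i, bit in enumerate(cells[4:12]):
        pf1 |= bit << (7 - i)           # PF1: bit 7 is drawn first
    pf2 = 0
    for i, bit in enumerate(cells[12:20]):
        pf2 |= bit << i                 # PF2: bit 0 is drawn first
    return pf0, pf1, pf2
```

Each call produces one entry for each of the three tables, so a playfield of N source lines yields three N-byte tables.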
Blank lines and comments are ignored.  The individual PF tables
can be referenced as var0, var1, etc. and var itself can be used
as the target of a SETPF statement. 
The playfield vectors can be individually manipulated
via the system vars PF0, PF1, and PF2.  This works both ways;
one can query PF0 to locate the current working PF0 table, or
LET PF0=INDEX(mytable) to set a new PF0 table without affecting
the other playfield settings.
The optional WIDE directive creates an asymmetrical playfield.
If this is specified, six tables var0 through var5 are created
for the asymmetric playfield kernal and the lines may contain
up to 40 meaningful characters instead of just 20.

Expression Evaluation Details
Every individual math operation (relational operation or function)
is performed in one of the four types (BYTE, SBYTE, WORD, SWORD).
Some operations are unaffected by truncation or sign treatment
(addition) and others are radically affected (multiplication,
division, magnitude comparison).  The type of evaluation can change
throughout the evaluation of an expression as some operations (like
magnitude comparison) return different types of results than their
source data, and source data types are mixed in operations.  BAStella
tries her best to convert between types, but some (especially
signed/unsigned conflicts) must be handled manually.
The Types can be used as functions to force all evaluations within
them to be done at the indicated precision. For example,
 A = SWORD(3 * B / 4)
forces all calculations to be done as signed words, even if
A is a byte and B is unsigned.  Type forcing can be nested:
 A = SWORD(3 * (BYTE(C/D) / 4))
In this case the C/D calculation will be done as unsigned bytes,
but all other calculations will be done as signed words.  Since
Stella's screen is 192 lines tall, a number which is positive in
practice but negative in 2's complement signed byte arithmetic,
the distinction can be important.
Conversions from 16-bit to 8-bit types will be truncated (and
this will ruin signed math).  Conversions from 8-bit to 16-bit
math will, if the source is signed, return $FF for the high
byte if the source is negative, preserving its sign.
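The conversion rules can be checked with a small Python model. The helper names are invented for the example; values are held as raw bit patterns, as they would be in RAM.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's complement byte."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

def narrow16to8(value):
    """16-bit to 8-bit conversion: keep the low byte, drop the high byte."""
    return value & 0xFF

def widen8to16(value, signed):
    """8-bit to 16-bit conversion. A negative signed source gets $FF as
    its high byte (sign extension); anything else gets $00."""
    value &= 0xFF
    if signed and value & 0x80:
        return 0xFF00 | value
    return value
```

Here the screen height 192 ($C0) reads as -64 when treated as a signed byte, and truncating the 16-bit pattern for -200 ($FF38) leaves $38, which is +56.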

Not Found in VCS.H
These are control variables treated in many ways like VCS.H
references, but actually controlling BAStella virtualized
None of these are included unless actually used by a BAStella
feature which has been used, like a SCREEN component or any
reference at all to FRAME.
 P0X Player 0 X position
 P0Y          Y position
 P0H          and height
 P1X and Player 1
 M0X Missile 0, it's always there unless off-screen
 M0Y vertically
 M0H height, width determined by VCS.H standby NUSIZn
 M1X and Missile 1
 BLX and the Ball
 BLH width determined by VCS.H standby CTRLPF
 PF0 These refer not to the hardware PFregs, but
 PF1 to the beginning of tables for kernals that
 PF2 will keep the PFregs updated.
 PF3 And these are used by the asymmetric playfield
 PF4 kernal for the other half of the screen.
 FRAME Counter incremented each time kernal is run
 ACCLO Used by the expression evaluator, if 16-bit
 ACCHI operations are called for  (If only 8-bit
 OPERLO operations are used, only OPERLO is used)
 PFOFFSET  If used, sets the first bigpix line of the
  first playfield kernal to a variable height,
  so the playfield can smooth-scroll vertically.
  If unused it's unallocated and a constant is
  hard-coded.  If used, it's initialized with
  the default bigpix height for the first playfield
  kernal component by MAKE SCREEN.
 SCOR0 16-bit pointers to score character maps.
 SCOR3 Thru SCOR3 required by SCORE2 kernal
 SCOR5 Thru SCOR5 required by SCORE6 kernal
 SCOR7 Thru SCOR7 required by SCORE4 kernal
 (Yep, 16 bytes for scoring.  I assumed the programmer would
 want the flexibility.  It might be prudent to do less RAM
 intensive versions that fix the character set, though.)

Hello, World
At this point I see this as the starting point BAStella program:
 'Default kernal is ALL5
 SETPF helloworld
 : X X XXX X   X   X   XXX
 : X X X   X   X   X   X X
 : XXX XXX X   X   X   X X
 : X X X   X   X   X   X X
 :   X X XXX XX  X   XX
 :   X X X X X X X   X X
 :   X X X X XX  X   X X
 :    X  X X X X X   X X
LOOP: GOTO LOOP 'don't do anything
At this point, this is the End of the Document.