I was wondering, is it absolutely necessary to completely unroll this if you
had, let's say, DPC functionality and/or Supercharger RAM?

In other words, if all the data, including the HMMx and ENAxx values, could be
read in via absolute loads (slower than immediate, of course, but faster than
indexed), would it be fast enough?
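
For a rough feel of the budget, here is a back-of-the-envelope tally in Python using the documented 6502 timings for LDA and a zero-page STA, and the 76 CPU cycles available per scanline. The function name and the `overhead` parameter are made up for illustration. It only counts load/store pairs and ignores the exact moments at which the GRPx writes have to land, so treat it as an upper bound, not as an answer.

```python
# Documented 6502 cycle counts for LDA, assuming no page crossing.
LDA_CYCLES = {
    "immediate": 2,   # LDA #$nn
    "absolute": 4,    # LDA $nnnn
    "absolute,X": 4,  # LDA $nnnn,X (+1 if a page boundary is crossed)
    "(zp),Y": 5,      # LDA ($nn),Y (+1 if a page boundary is crossed)
}
STA_ZP_CYCLES = 3          # STA to a TIA register (zero-page address)
CYCLES_PER_SCANLINE = 76   # 228 color clocks / 3


def max_writes_per_line(mode, overhead=0):
    """Upper bound on LDA+STA pairs that fit in one scanline.

    `overhead` is a hypothetical allowance for everything else on the
    line (loop control, WSYNC, and so on).
    """
    per_write = LDA_CYCLES[mode] + STA_ZP_CYCLES
    return (CYCLES_PER_SCANLINE - overhead) // per_write


if __name__ == "__main__":
    for mode in LDA_CYCLES:
        print(mode, max_writes_per_line(mode))
```

By this count, immediate loads leave room for at most 15 register writes per line and absolute loads for at most 10, before any other overhead is subtracted.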

Someone suggested that I post here in case not everyone follows the AtariAge
blogs. I recently made a tool that converts a BMP file to assembly code for
a 52-pixel-wide sprite. It works by combining the 48-pixel sprite with the
missiles and the ball. One of the missiles becomes a 2-pixel sprite through a
combination of HMM0, NUSIZ0, and ENAM0. The larger image takes a lot of ROM
space, and is not as movable as the 48-pixel sprite. I'll leave it to you to
weigh the costs and benefits.
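
To give an idea of the kind of conversion involved, here is a minimal Python sketch, not the tool's actual code. It splits one 52-pixel row into six bytes for the 48-pixel sprite plus four leftover bits for the missiles and ball. Placing the four extra pixels on the right edge is purely an assumption for illustration, and the function names `split_row` and `to_asm` are hypothetical.

```python
def split_row(pixels):
    """Split one 52-pixel row (a sequence of 0/1 values) into six sprite
    bytes and four extra bits.

    Assumption: the extra pixels are the rightmost four. The real tool
    may lay them out differently.
    """
    if len(pixels) != 52:
        raise ValueError("expected exactly 52 pixels per row")
    sprite_bits = pixels[:48]
    extra_bits = list(pixels[48:])
    sprite_bytes = []
    for i in range(0, 48, 8):
        byte = 0
        for bit in sprite_bits[i:i + 8]:
            byte = (byte << 1) | bit   # leftmost pixel ends up in bit 7
        sprite_bytes.append(byte)
    return sprite_bytes, extra_bits


def to_asm(sprite_bytes):
    """Format the sprite bytes as a DASM-style .byte line in binary."""
    return "    .byte " + ", ".join(f"%{b:08b}" for b in sprite_bytes)


if __name__ == "__main__":
    row = [1] * 8 + [0] * 8 + [1, 0] * 4 + [0] * 24 + [1, 0, 1, 1]
    sprite_bytes, extra_bits = split_row(row)
    print(to_asm(sprite_bytes))
    print("extra bits:", extra_bits)
```

The leftover bits would then have to be turned into per-line HMMx, NUSIZx, and ENAxx values, which is where most of the extra ROM goes.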

More details and download at:

