But another thought: a VCS pixel must be something like a C64 multicolor
pixel, i.e. its width is doubled, let's say '**' instead of '*', right?
That would not only give decent sprites when using 2LKs, but would
still fit my 8 (4 doubled):6 pixels for a rectangular block at a 4:3
screen resolution. Or am I totally wrong with that theory?

Pixels are an arbitrary unit of measurement on a video screen.
That's why Atari preferred the term "color clocks", which refers to the video signal itself.
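
To put rough numbers on the question above, here is a back-of-the-envelope sketch in Python. The 228 color clocks per scanline (160 of them visible) and 3 color clocks per CPU cycle are the usual TIA figures. The 192 visible scanlines and the idea that the visible picture exactly fills a 4:3 screen are assumptions for illustration only; real TVs overscan and differ from set to set, so treat the result as a ballpark. The function name `cell_aspect` is made up for this example.

```python
from fractions import Fraction

# TIA timing figures
COLOR_CLOCKS_PER_LINE = 228      # total color clocks per scanline
VISIBLE_COLOR_CLOCKS = 160       # color clocks that produce picture
COLOR_CLOCKS_PER_CPU_CYCLE = 3   # the 6507 runs at 1/3 the color clock

# Assumptions (not fixed by the hardware): a typical NTSC kernel height,
# and a picture that exactly fills a 4:3 screen.
ASSUMED_VISIBLE_SCANLINES = 192
ASSUMED_SCREEN_ASPECT = Fraction(4, 3)


def cell_aspect(scanlines_per_row=1, clocks_per_cell=1):
    """Width:height ratio of one picture cell under the assumptions above."""
    cell_width = ASSUMED_SCREEN_ASPECT / VISIBLE_COLOR_CLOCKS * clocks_per_cell
    cell_height = Fraction(1, ASSUMED_VISIBLE_SCANLINES) * scanlines_per_row
    return cell_width / cell_height


cpu_cycles_per_line = COLOR_CLOCKS_PER_LINE // COLOR_CLOCKS_PER_CPU_CYCLE

single_line = cell_aspect()                    # one color clock, one scanline
two_line = cell_aspect(scanlines_per_row=2)    # one color clock in a 2LK

print("CPU cycles per scanline:", cpu_cycles_per_line)
print("1-line kernel cell, width:height =", float(single_line))
print("2-line kernel cell, width:height =", float(two_line))
```

Under these assumptions a single color clock comes out about 1.6 times as wide as a scanline is tall, so it is wide, though not exactly doubled like a C64 multicolor pixel. With a 2LK the cell is about 0.8:1, slightly taller than wide.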

