Hey, I also made a test program, and at least on my computer, plain Scanline is faster than anything else.
Download it at http://jaeger.xenoware.de/blttest.zip
And yes, XCESS' transparent blit routine is super-slow. It's actually a translation of a code snippet I found on MSDN a long time ago.
The transparent-blt routine in my blt-test program is the same one that XCESS uses.