I wonder... maybe a TImage with TBitmaps as the sprite and back buffers would be sufficient for this. The back buffer only needs to be 320x240; then set the TImage's Stretch property to True so it scales up and gives that nostalgic feel.
SDL may be a good choice (I've never used it myself), whereas OpenGL and DirectX suffer in speed if you need to manipulate texture pixels on the fly. Of course, if your levels are made of small elements on a grid, then per-pixel manipulation isn't necessary.
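
To show what the small back buffer plus stretching idea amounts to, here is a rough, language-neutral sketch in Python (not Delphi code). The names make_buffer and stretch are made up for illustration, pixels are plain integers, and nearest-neighbour scaling is an assumption on my part; how TImage actually filters when it stretches isn't something this sketch claims to reproduce.

```python
# Hypothetical sketch: a frame is a list of rows, each pixel an int colour value.

def make_buffer(width, height, colour=0):
    """Create a width x height buffer filled with one colour."""
    return [[colour] * width for _ in range(height)]

def stretch(src, dst_w, dst_h):
    """Scale src to dst_w x dst_h using nearest-neighbour sampling (assumed)."""
    src_h = len(src)
    src_w = len(src[0])
    return [
        [src[y * src_h // dst_h][x * src_w // dst_w] for x in range(dst_w)]
        for y in range(dst_h)
    ]

# Draw into the small 320x240 buffer, then blow it up to the display size.
back = make_buffer(320, 240)
back[10][20] = 7                  # one "pixel" at x=20, y=10
screen = stretch(back, 640, 480)  # it becomes a 2x2 block on screen
```

The point is that all game drawing touches only 320x240 pixels; the scaling is one cheap pass at the end.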

Edit: The API isn't slow. As said before, use buffers: draw everything in memory, then draw it all to the screen in one call. That's what OpenGL and DirectX do.
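
The buffering idea above can be sketched roughly like this, again in Python as a language-neutral illustration rather than Delphi code. Everything here is hypothetical: the Screen class is just a stand-in for the visible surface, TRANSPARENT is an assumed colour key, and blit and render_frame are names I picked for the example.

```python
TRANSPARENT = -1  # hypothetical colour key: pixels with this value are skipped

def blit(dst, sprite, left, top):
    """Copy sprite into dst at (left, top), clipping and skipping transparent pixels."""
    for sy, row in enumerate(sprite):
        y = top + sy
        if not 0 <= y < len(dst):
            continue
        for sx, colour in enumerate(row):
            x = left + sx
            if colour != TRANSPARENT and 0 <= x < len(dst[0]):
                dst[y][x] = colour

class Screen:
    """Stand-in for the visible surface; counts how often it gets written."""
    def __init__(self, width, height):
        self.pixels = [[0] * width for _ in range(height)]
        self.present_calls = 0

    def present(self, back):
        # The only place the "screen" is touched: one copy per frame.
        self.pixels = [row[:] for row in back]
        self.present_calls += 1

def render_frame(screen, sprites, width=320, height=240):
    """Compose the whole frame in memory, then show it with a single call."""
    back = [[0] * width for _ in range(height)]
    for sprite, left, top in sprites:
        blit(back, sprite, left, top)
    screen.present(back)
```

However many sprites you draw, the screen itself is only written once per frame; all the per-sprite work happens in the memory buffer.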