So rendering one TriangleList containing 1000 triangles is faster than 1000 TriangleLists with 1 triangle?
Definitely!
First, there's simply less DirectX API overhead; next, there's less driver overhead; and in this case (DrawPrimitive) there's also less video card state-change overhead!
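To make the per-call argument concrete, here is a rough sketch of the idea. It does not use the real Direct3D API: `MockDevice`, `drawTriangleList`, and the cost constants are all made up for illustration, and the numbers are arbitrary units, not measurements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a rendering device; NOT the real Direct3D API.
// It only counts how many draw calls and triangles were submitted.
struct MockDevice {
    std::size_t drawCalls = 0;
    std::size_t triangles = 0;

    // Each call models one trip through the API and driver layers.
    void drawTriangleList(std::size_t triangleCount) {
        assert(triangleCount > 0);
        ++drawCalls;
        triangles += triangleCount;
    }
};

// Assumed cost model in invented units: a fixed cost per call
// (API + driver + state change) plus a small cost per triangle.
constexpr std::size_t kPerCallCost = 100;
constexpr std::size_t kPerTriangleCost = 1;

std::size_t estimatedCost(const MockDevice& device) {
    return device.drawCalls * kPerCallCost + device.triangles * kPerTriangleCost;
}

// One call that submits all n triangles at once.
MockDevice drawBatched(std::size_t n) {
    MockDevice device;
    device.drawTriangleList(n);
    return device;
}

// n separate calls, one triangle each.
MockDevice drawOneByOne(std::size_t n) {
    MockDevice device;
    for (std::size_t i = 0; i < n; ++i) {
        device.drawTriangleList(1);
    }
    return device;
}
```

With these invented constants, `estimatedCost(drawBatched(1000))` comes to 1100 units and `estimatedCost(drawOneByOne(1000))` comes to 101000 units. Both submit the same 1000 triangles; the difference is entirely the fixed per-call cost being paid 1000 times instead of once.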