Quote Originally Posted by michalis
Actually, it would be nice to add a unit like GLMatrix that defines overloaded versions of glLoadMatrix routines that just take Tmatrix4_single/double as arguments and do the transpose inside implementation.
I like this idea. It would make OpenGL programming a bit higher level. We could also add overloads that accept open arrays of vectors, hiding the pointer handling that the *v functions require.
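A rough sketch of what such a unit could look like, assuming the Free Pascal gl and matrix units. The unit name GLMatrix comes from the quote above; the glVertices helper is a made-up name for illustration only. The transpose is there on the assumption that the matrix unit stores data as [row, column] while OpenGL expects column-major order.

[pascal]
unit GLMatrix; // hypothetical unit, not part of any existing package

{$mode objfpc}

interface

uses
  gl, matrix;

// Overloads that take the matrix types directly.
procedure glLoadMatrix(const m: Tmatrix4_single); overload;
procedure glLoadMatrix(const m: Tmatrix4_double); overload;

// Hypothetical helper: submits every vector of an open array as a vertex.
procedure glVertices(const v: array of Tvector3_single);

implementation

procedure glLoadMatrix(const m: Tmatrix4_single);
var
  t: Tmatrix4_single;
begin
  t := m.transpose; // assumed row-major storage, OpenGL wants column-major
  glLoadMatrixf(@t.data[0, 0]);
end;

procedure glLoadMatrix(const m: Tmatrix4_double);
var
  t: Tmatrix4_double;
begin
  t := m.transpose;
  glLoadMatrixd(@t.data[0, 0]);
end;

procedure glVertices(const v: array of Tvector3_single);
var
  i: Integer;
begin
  // The caller never sees the pointer that glVertex3fv needs.
  for i := Low(v) to High(v) do
    glVertex3fv(@v[i].data[0]);
end;

end.
[/pascal]

With something like this, calling code would just write glLoadMatrix(m) and never touch a pointer or think about the transpose.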

Quote Originally Posted by michalis
I even thought about doing descendants like TMatrix4_single_GL that have methods like glLoadMatrix with implementation like

[pascal]
procedure TMatrix4_single_GL.glLoadMatrix;
begin
  GL.glLoadMatrix(@transpose.data);
end;
[/pascal]

... but this will not be so nice, since you will have to override some operators again for new TMatrix4_single_GL class, and if you receive your matrix instance from some non-OpenGL-related unit then it will still have normal TMatrix4_single class, not TMatrix4_single_GL.
Well, the multiply operator could call a virtual method to do the actual work (e.g. the multiplication); then a single set of operators would be enough for both classes. However, I don't know whether this is desirable for speed reasons, since every multiplication would then go through a virtual call.
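To show what I mean, here is a small standalone sketch. It uses made-up 2x2 classes (TMat2, TMat2GL) rather than the real matrix unit types, which are objects and not classes, so take it as an illustration of the dispatch only. The operator is defined once, for the base class, and the virtual Multiply decides which class the product has.

[pascal]
program VirtualMulSketch;

{$mode objfpc}

type
  // Hypothetical stand-in for a matrix class; not the matrix unit's type.
  TMat2 = class
    data: array[0..1, 0..1] of Single;
    // Virtual, so a descendant decides which class the product has.
    function Multiply(other: TMat2): TMat2; virtual;
  protected
    procedure MulInto(other, dest: TMat2);
  end;

  TMat2GL = class(TMat2)
    function Multiply(other: TMat2): TMat2; override;
  end;

procedure TMat2.MulInto(other, dest: TMat2);
var
  i, j, k: Integer;
begin
  for i := 0 to 1 do
    for j := 0 to 1 do
    begin
      dest.data[i, j] := 0;
      for k := 0 to 1 do
        dest.data[i, j] := dest.data[i, j] + data[i, k] * other.data[k, j];
    end;
end;

function TMat2.Multiply(other: TMat2): TMat2;
begin
  Result := TMat2.Create;
  MulInto(other, Result);
end;

function TMat2GL.Multiply(other: TMat2): TMat2;
begin
  Result := TMat2GL.Create; // the product keeps the GL-aware class
  MulInto(other, Result);
end;

// The only operator: it just forwards to the virtual method.
operator * (x, y: TMat2): TMat2;
begin
  Result := x.Multiply(y);
end;

var
  a, b, c: TMat2;
begin
  a := TMat2GL.Create;
  b := TMat2.Create;
  a.data[0, 0] := 1;
  a.data[1, 1] := 1;
  b.data[0, 1] := 2;
  c := a * b;            // dispatches to TMat2GL.Multiply via the left operand
  WriteLn(c.ClassName);  // reports the runtime class of the product
  c.Free;
  b.Free;
  a.Free;
end.
[/pascal]

The cost is one virtual call (and here a heap allocation) per multiplication, which is exactly the speed concern. It also does not solve the other problem from the quote: a matrix that comes from a non-OpenGL unit is still a plain base-class instance.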