Still, such optimizations hardly match hand-tuned assembly, and you use that when you really need it. A compiler speeding up trivial loops with unrolling and vectorization, when those loops only get called once in a while, isn't much of an improvement for the user. It's more of a "show-off" for the compiler, at least in my opinion. It's definitely good for those who don't know assembly and can't hand-tune things themselves.