
Help! D3DDEVTYPE_HAL and D3DCREATE_HARDWARE_VERTEXPROCESSING



cronodragon
09-11-2006, 09:20 PM
I found a weird problem. I made an application that loads a large number of vertices and tested it on two machines. Both have NVIDIA cards: an FX 5200 and a Ti 4200. When I set up the device in DirectX, I use these parameters:

D3DDEVTYPE_HAL and D3DCREATE_HARDWARE_VERTEXPROCESSING

Those are supposed to give the best performance with that hardware. On the Ti card that is the fastest configuration, but on the FX card it turns out to be the slower one. Then I tried these on the FX:

D3DDEVTYPE_HAL and D3DCREATE_SOFTWARE_VERTEXPROCESSING

And those are the fastest on that card. DirectX is 9.0c, and both machines have the same version and subversion. The NVIDIA drivers are also version 91.31 on both. There is no logical reason for software vertex processing to be faster than hardware. Any idea what the problem could be? :?

Clootie
09-11-2006, 10:49 PM
Both the FX 5200 and the GeForce4 Ti should have more or less equal vertex shader performance. So if you see a significant performance difference, you may be using HW T&L suboptimally, and the CPU in the machine with the 5200 is just more powerful.

cronodragon
09-11-2006, 11:11 PM
Thanks. Then I just have to select the appropriate combination... the problem is, how do I know which one works better? Both are accepted by DX without complaint. :?

Clootie
10-11-2006, 12:42 AM
But actually, it looks to me like you are misusing HW T&L one way or another: both cards you mentioned should be able to transform more than 10 million vertices per second. Do you really have such complex objects? If not, the bottleneck is probably suboptimal use of the GPU.

cronodragon
10-11-2006, 12:52 AM
Well, at first I was using software processing, which pushed me to optimize my engine a lot. In fact, I have fully rewritten the rendering module about 4 times now!!

The geometry I'm drawing is big: a character model of >4500 vertices (not indexed), repeated 100 times. With software processing it ran just acceptably; then I changed to hardware processing and now it runs great on this machine. But when I tested it on the other machine, it ran slowly. So I switched back to software for that machine only, and it runs like this one... I have no clue.

I think it would be better to run a small automatic test on each machine before starting the application, to select the right combination of initialization parameters... I think that's a normal method in professional games.
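
A rough sketch of what such a startup test could look like, in plain C++ with no Direct3D calls. All names here (timeSeconds, fastestIndex, pickFastest) are hypothetical, and each candidate is assumed to be a callback that creates the device with one flag combination, draws a fixed number of test frames, and releases the device again:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Wall-clock seconds taken by one call to `work`.
// `work` is a hypothetical callback, e.g. "create the device with one flag
// combination, draw N test frames, release the device".
double timeSeconds(const std::function<void()>& work) {
    assert(work != nullptr);  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Index of the smallest measured time, or -1 if nothing was measured.
// On a tie the earlier candidate wins.
int fastestIndex(const std::vector<double>& seconds) {
    int best = -1;
    for (std::size_t i = 0; i < seconds.size(); ++i) {
        if (best < 0 || seconds[i] < seconds[static_cast<std::size_t>(best)]) {
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Times every candidate once and returns the index of the fastest one.
int pickFastest(const std::vector<std::function<void()>>& candidates) {
    std::vector<double> seconds;
    seconds.reserve(candidates.size());
    for (const auto& run : candidates) {
        seconds.push_back(timeSeconds(run));
    }
    return fastestIndex(seconds);
}
```

A single timed run is noisy, so a real test would probably repeat each candidate a few times and store the winner in a settings file rather than measuring on every launch.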

Clootie
10-11-2006, 01:18 AM
Yes, your numbers can be stressful for these GPUs.
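
To put rough numbers on it, here is a small back-of-the-envelope sketch. The 4500 vertices, 100 copies and 10 million vertices per second come from the posts above; the 30 fps target is only an assumption for illustration, and the function names are made up:

```cpp
// Vertices submitted per frame when a non-indexed model is drawn `instances` times.
constexpr long long verticesPerFrame(long long verticesPerModel, long long instances) {
    return verticesPerModel * instances;
}

// Frame-rate ceiling if vertex throughput were the only limit.
constexpr double fpsCeiling(double verticesPerSecond, long long perFrame) {
    return verticesPerSecond / static_cast<double>(perFrame);
}

// Vertex throughput needed to hold a target frame rate (the target is an assumption).
constexpr double requiredVerticesPerSecond(long long perFrame, double targetFps) {
    return static_cast<double>(perFrame) * targetFps;
}

// 4500 is a lower bound (the model has ">4500" vertices), so the real load is higher.
constexpr long long kPerFrame = verticesPerFrame(4500, 100);             // 450,000
constexpr double kCeiling = fpsCeiling(10e6, kPerFrame);                 // about 22.2 fps
constexpr double kNeededAt30 = requiredVerticesPerSecond(kPerFrame, 30); // 13.5 million/s
```

So at exactly 10 million vertices per second the scene would top out around 22 frames per second from vertex work alone, and an assumed 30 fps target would need about 13.5 million per second.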

Well, you have to make sure you are allocating the VB in video memory for HW-accelerated modes. And using indexed meshes is always a win for HW T&L (this assumes some vertices are shared by separate triangles, which is the common case with normal closed meshes). By using indexed meshes and optimizing them for the vertex cache, you can usually get about 50% more performance.
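
A minimal sketch of the indexing step, assuming a position-only vertex and 16-bit indices. Vertex, IndexedMesh and buildIndexedMesh are hypothetical names, nothing here calls Direct3D, and a real mesh would also have to match normals, texture coordinates and so on before two vertices can be merged:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical vertex layout: position only.
struct Vertex {
    float x, y, z;
};

// Ordering so Vertex can be a std::map key; vertices merge only when
// all components compare exactly equal.
struct VertexLess {
    bool operator()(const Vertex& a, const Vertex& b) const {
        return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
    }
};

struct IndexedMesh {
    std::vector<Vertex> vertices;        // each unique vertex stored once
    std::vector<std::uint16_t> indices;  // three indices per triangle
};

// Converts a non-indexed triangle list (three vertices per triangle)
// into unique vertices plus an index list.
IndexedMesh buildIndexedMesh(const std::vector<Vertex>& triangleList) {
    assert(triangleList.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<Vertex, std::uint16_t, VertexLess> seen;
    for (const Vertex& v : triangleList) {
        auto found = seen.find(v);
        if (found == seen.end()) {
            // The new index must still fit in 16 bits.
            assert(mesh.vertices.size() <= 0xFFFFu);
            const auto index = static_cast<std::uint16_t>(mesh.vertices.size());
            found = seen.emplace(v, index).first;
            mesh.vertices.push_back(v);
        }
        mesh.indices.push_back(found->second);
    }
    return mesh;
}
```

For two triangles sharing an edge, six input vertices become four unique ones plus six indices. This only removes duplicates; reordering the triangles for the vertex cache is a separate step that is not shown here.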

cronodragon
10-11-2006, 02:45 AM
OK, thanks for the advice. I'll keep optimizing the engine... in fact, I'll switch back to the slower processing to push myself to work on better optimizations 8)