Yep.
Actually I think I'll use what SilverWarior suggested: a CLASS containing all sprite information as an ARRAY of RECORDs.
Yep! What about TObjectList and TDictionary?
AFAIK both TObjectList and TDictionary are dynamic and designed for easy insertion and deletion, and that dynamic memory management is slow.
The best demonstrated game performance came from data pools, where no dynamic memory management is done: you just activate and deactivate static memory.
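A minimal sketch of the "activate and de-activate static memory" idea, assuming a fixed-size bullet pool. The names `TBullet`, `MaxBullets`, `SpawnBullet` and `KillBullet` are made up for illustration and do not come from any engine mentioned in this thread.

```pascal
program StaticPoolSketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const
  MaxBullets = 64;  { hypothetical capacity, chosen for illustration }

type
  TBullet = record
    Active: Boolean;
    x, y: Integer;
  end;

var
  { Static storage: global variables start zeroed, so every slot is inactive. }
  Bullets: array[0..MaxBullets - 1] of TBullet;

{ Activates the first inactive slot and returns its index, or -1 if the pool is full. }
function SpawnBullet(ax, ay: Integer): Integer;
var
  i: Integer;
begin
  Result := -1;
  for i := 0 to MaxBullets - 1 do
    if not Bullets[i].Active then
    begin
      Bullets[i].Active := True;
      Bullets[i].x := ax;
      Bullets[i].y := ay;
      Result := i;
      Exit;
    end;
end;

{ Deactivates a slot; the memory stays where it is and is reused by a later spawn. }
procedure KillBullet(Index: Integer);
begin
  Bullets[Index].Active := False;
end;

var
  a, b: Integer;
begin
  a := SpawnBullet(10, 20);
  b := SpawnBullet(30, 40);
  KillBullet(a);
  { The next spawn lands in the slot that was just deactivated. }
  WriteLn('Slot reused: ', SpawnBullet(50, 60) = a);
  WriteLn('Second bullet still in slot ', b);
end.
```

No allocation or deallocation happens after program start. The linear scan in `SpawnBullet` is the price of keeping the sketch this simple.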
It is not so hard to upgrade most lists into object pools.
Here is a link to a short article about this topic:
https://parnassus.co/custom-object-m...un-and-profit/
I plan to spend some more time studying this subject in the future.
Yes - pre-allocate storage for data, pre-construct object instances and then just fetch them from the pools. You can grow as needed and construct doubly linked lists for quick insertion and deletion if required.
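One way to read that, as a rough sketch only: a pre-allocated array whose unused slots form a free list and whose used slots form a doubly linked list, so fetching and releasing are both O(1). All names here (`TNode`, `Pool`, `Acquire`, `Release`) are assumptions for the example, and the links are array indices rather than pointers.

```pascal
program FreeListPoolSketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const
  PoolSize = 8;   { hypothetical capacity }
  NoIndex = -1;   { marks "no neighbour" }

type
  TNode = record
    Value: Integer;
    Prev, Next: Integer;   { indices into Pool instead of pointers }
  end;

var
  Pool: array[0..PoolSize - 1] of TNode;
  FreeHead: Integer;    { head of the singly linked free list }
  ActiveHead: Integer;  { head of the doubly linked active list }

{ Chains every slot into the free list; nothing is active yet. }
procedure InitPool;
var
  i: Integer;
begin
  for i := 0 to PoolSize - 1 do
  begin
    Pool[i].Prev := NoIndex;
    Pool[i].Next := i + 1;
  end;
  Pool[PoolSize - 1].Next := NoIndex;
  FreeHead := 0;
  ActiveHead := NoIndex;
end;

{ Takes a slot from the free list and links it at the front of the active list. }
function Acquire(AValue: Integer): Integer;
begin
  Result := FreeHead;
  if Result = NoIndex then Exit;   { pool exhausted; a real pool could grow here }
  FreeHead := Pool[Result].Next;
  Pool[Result].Value := AValue;
  Pool[Result].Prev := NoIndex;
  Pool[Result].Next := ActiveHead;
  if ActiveHead <> NoIndex then
    Pool[ActiveHead].Prev := Result;
  ActiveHead := Result;
end;

{ Unlinks an active slot and pushes it back on the free list.
  Not guarded against releasing the same slot twice. }
procedure Release(Index: Integer);
begin
  if Pool[Index].Prev <> NoIndex then
    Pool[Pool[Index].Prev].Next := Pool[Index].Next
  else
    ActiveHead := Pool[Index].Next;
  if Pool[Index].Next <> NoIndex then
    Pool[Pool[Index].Next].Prev := Pool[Index].Prev;
  Pool[Index].Next := FreeHead;
  FreeHead := Index;
end;

var
  b, i: Integer;
begin
  InitPool;
  Acquire(1);
  b := Acquire(2);
  Acquire(3);
  Release(b);               { removed from the middle without scanning }
  i := ActiveHead;
  while i <> NoIndex do     { prints 3, then 1 }
  begin
    WriteLn(Pool[i].Value);
    i := Pool[i].Next;
  end;
end.
```

Using indices instead of pointers means the links would stay valid even if the storage were later moved to a dynamic array that grows.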
I usually sync my linked list operations to a hash table or spatial partitioning paradigm where appropriate as it's much faster to keep multiple structures in sync per operation than it is to keep them in sync by scanning the whole collection.
I usually add on a thread safe reference count operation that will flag the data/object as unused rather than handing the job over to reference counted interfaces. In games we only really care about getting images, geometry, sounds etc out of memory - compared to the size of that stuff we don't care about instance sizes or constant data pools that rarely extend past dozens of MB. Leave the dynamic allocation for really big data.
For data/objects that are written/read from multiple threads I wrap it in a task that gets passed along in chains across priority queues in each thread to ensure single thread access and operation order. Along with lock-free queues this means I don't have to maintain slow locks for bits of data or objects - I just ensure it's impossible for them to be accessed at the same time. (Same goes for rendering: I don't lock any data or structures, I just pause the threads processing queues that might access the data the render thread needs - the so-called 'render window'.)
Tasks can be started and stopped at any point to ensure a maximum task time per cycle, or flagged as frame critical to ensure the task is complete in time to finish rendering before the start of the next V-synced frame. (So it's like a crude OS scheduler, but instead of sharing time on a processor I'm sharing the time available per frame, minus the time to render the frame plus jitter overhead.)
This is the best way I know of handling time wasted idling in the swap operation. Disabling V-sync is a stupid thing to do. You can't show more frames than the refresh rate of the screen - you only have to minimise time spent in the swap operation and time things carefully so you don't run over into the next window and cause an uneven framerate. You should be measuring performance by the time it takes to render each frame, not by how many frames per second you can push through. FPS doesn't tell you anything useful at all except whether one computer is faster or slower than another on a given static task. Pipelines are too complex to rely on FPS as an indication during optimisation - high-precision timers on actual operations are best. You can use the GL timer API to get true frame render times rather than putting a flag either side of the flush and swap. The card may have already started by then, so you really want GL timers. (I'm sure DirectX and Vulkan have something similar.)
Sorry, I digress. When don't I?
Last edited by phibermon; 12-07-2017 at 01:29 AM.
When the moon hits your eye like a big pizza pie - that's an extinction level impact event.
Here is a little benchmark for Delphi I did.
https://github.com/turric4n/Delphi-B...jectsVSRecords
Records are faster than heap-allocated records (pointers to records) and much faster than objects.
That's just what the engines I know use (Build, Action Arcade Adventure Set), and it's the approach I'll use in my engine.
Like this:
Code:
TYPE
  TSpritePtr = ^TSprite;
  TSprite = RECORD
    x, y: INTEGER;
    Bmp: ImageRef;
    NextSpr: TSpritePtr;
  END;

VAR
  SpriteList: ARRAY OF TSprite; { Or a class containing the list, or a specialized GENERIC or... }
  PlayerSpr: TSpritePtr;
  FirstEnemy: TSpritePtr;
  FirstBullet: TSpritePtr;
Since the pointer itself is stored inside the list, there are very few CPU-cache collisions, so traversing the linked list should be quite fast.
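A small sketch of how the TSprite declarations in the post above might be walked. It assumes the array is sized once up front, and it swaps the `ImageRef` type (not shown in the post) for a plain `Pointer` placeholder so the example compiles on its own.

```pascal
program SpriteListSketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  TSpritePtr = ^TSprite;
  TSprite = record
    x, y: Integer;
    Bmp: Pointer;          { placeholder: the real ImageRef type is not shown in the post }
    NextSpr: TSpritePtr;
  end;

var
  SpriteList: array of TSprite;
  FirstEnemy, Spr: TSpritePtr;
  i: Integer;
begin
  { Size the array once: growing it later with SetLength may move the
    records and leave every stored TSpritePtr dangling. }
  SetLength(SpriteList, 4);

  { Chain slots 1..3 together as enemies; slot 0 is left for the player. }
  for i := 1 to 3 do
  begin
    SpriteList[i].x := i * 16;
    SpriteList[i].y := 100;
    if i < 3 then
      SpriteList[i].NextSpr := @SpriteList[i + 1]
    else
      SpriteList[i].NextSpr := nil;
  end;
  FirstEnemy := @SpriteList[1];

  { Walk only the enemies by following the links stored in the records. }
  Spr := FirstEnemy;
  while Spr <> nil do
  begin
    WriteLn('Enemy at ', Spr^.x, ',', Spr^.y);
    Spr := Spr^.NextSpr;
  end;
end.
```

Because every record lives in one contiguous block, each hop in the chain stays inside memory that was allocated together, which is the locality argument made above.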
[edit]
Fun fact: this may actually have been faster in the old days (i.e. 8086, 80286, 80386), when memory was divided into pages (do you remember FAR and NEAR pointers?). If the whole list is in one single memory page, no page change is needed, and memory access may be faster, especially if the related code is also in the same memory page. The AAAkit book talks about it too.
Thanks pal. It confirms the hypothesis "Objects/CLASSes are the slowest".
[I must test it on FPC though, but I think it will be the same ]
Last edited by Ñuño Martínez; 13-07-2017 at 08:09 AM.
It's tricky. I have similar tests, and the results depend on:
1) How many records/objects exist
2) Complexity of the record/object (number of variables)
3) Whether features like sorting are needed
The results will be completely different with:
Code:
TSpriteRec = record
  ID : integer;
  x, y : integer;
  Name : String[80];
  angle : double;
  alpha : byte;
  components : array [0..9] of integer;
  p : pointer;
end;
In general, an array of pointers is fastest.
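As an illustration only (not a benchmark), here is why an array of pointers can help once sorting is involved: the sort moves pointer-sized values and the large records never move. `Storage`, `Order` and the insertion sort are assumptions for this sketch, and the record layout is the one quoted above.

```pascal
program PointerSortSketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const
  Count = 4;   { hypothetical number of sprites }

type
  PSpriteRec = ^TSpriteRec;
  TSpriteRec = record
    ID: Integer;
    x, y: Integer;
    Name: String[80];
    angle: Double;
    alpha: Byte;
    components: array[0..9] of Integer;
    p: Pointer;
  end;

var
  Storage: array[0..Count - 1] of TSpriteRec;  { the records never move }
  Order: array[0..Count - 1] of PSpriteRec;    { only these pointers get shuffled }
  i, j: Integer;
  Tmp: PSpriteRec;
begin
  for i := 0 to Count - 1 do
  begin
    Storage[i].ID := i;
    Storage[i].x := (Count - i) * 10;   { descending x, so the sort has work to do }
    Order[i] := @Storage[i];
  end;

  { Insertion sort by x: each move copies one pointer, not a whole record.
    Relies on the default short-circuit evaluation of "and". }
  for i := 1 to Count - 1 do
  begin
    Tmp := Order[i];
    j := i - 1;
    while (j >= 0) and (Order[j]^.x > Tmp^.x) do
    begin
      Order[j + 1] := Order[j];
      Dec(j);
    end;
    Order[j + 1] := Tmp;
  end;

  { Lists the sprites in ascending x order. }
  for i := 0 to Count - 1 do
    WriteLn('ID ', Order[i]^.ID, ' x=', Order[i]^.x);
end.
```

Whether this beats sorting the records directly depends on record size and count, as the points above say, so it is worth timing both on the real data.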
Last edited by JC_; 15-07-2017 at 11:29 AM.
Interesting.
There are a lot of variables* involved. All these tests are useful but only the actual implementation will tell what is the fastest.
_____________________________
* Pun not intended