PDA

View Full Version : Cheb's project will be here.



Chebmaster
18-01-2018, 06:26 PM
(last progress was a long time ago; the next should hopefully be around December 2019 or January 2020)

The project's website: http://chentrah.chebmaster.com/

The full saga in Russian http://freepascal.ru/forum/viewtopic.php?f=10&t=10058

The project history: [pending update and re-checking, will re-post at the same time I upload the public Test #21]

Chebmaster
28-01-2018, 09:56 AM
http://chebmaster.com/_share/chentrah_2018_01_18.jpg


Yet another !surprise! from this newfangled string encoding auto-management.

I launched my engine compiled in fpc 3.0.4 on the mammoth coprolite I call my file server:

Chentrah version 0.21.3847 for Win32-i386,
compiled at 12:21:52 on 2018/01/28 using Free Pascal 3.0.4.
(developer mode on)
Operating System: Wine 1.3.28 / Ubuntu 11.10
User name: cheb
CPU Phenom II X2 550
x2 logical cores
level 2 cache: 512 Kbytes, line size 64 bytes
TSC invariancy: yes
TSC frequency: 3.11 GHz)


Lo and behold it crashed with "Failed to load "GL_ARB_framebuffer_object"".
Scratching the bone between my ears, I checked the extension string. Nope, the extension was there. So why?

Unearthed an old function naively assuming that String = AnsiString:

function glext_ExtensionSupported(const extension: String; const searchIn: String): Boolean;
var
  extensions: PAnsiChar;
  start: PAnsiChar;
  where, terminator: PAnsiChar;
begin
  if (Pos(' ', extension) <> 0) or (extension = '') then
  begin
    Result := FALSE;
    Exit;
  end;

  if searchIn = '' then extensions := PAnsiChar(glGetString(GL_EXTENSIONS))
  else extensions := PAnsiChar(searchIn);
  start := extensions;
  while TRUE do
  begin
    where := StrPos(start, PAnsiChar(extension));
    if where = nil then Break;
    terminator := Pointer(PtrUInt(where) + Length(extension));
    if (where = start) or (PAnsiChar(PtrUInt(where) - 1)^ = ' ') then
    begin
      if (terminator^ = ' ') or (terminator^ = #0) then
      begin
        Result := TRUE;
        Exit;
      end;
    end;
    start := terminator;
  end;
  Result := FALSE;
end;


Corrected it to AnsiString and my engine started up, proving there is life on GeForce 7025.

But!
The wrong version was working fine in real Windows.
I can only assume that fossilized Wine lacked some mechanism the automatic re-encoding in the FPC RTL relies on.
That's another thing to watch out for, I suppose.
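A minimal sketch of the trap (assuming the failing unit was compiled with `{$modeswitch unicodestrings}`, under which `String` is an alias for `UnicodeString`): the down-conversion to an 8-bit string is routed through the RTL's widestring manager, which on Windows leans on OS conversion facilities — presumably the very mechanism that ancient Wine lacked.

```pascal
program strtrap;
{$mode objfpc}
{$modeswitch unicodestrings} // under this switch, String = UnicodeString
var
  s: String;     // UTF-16, two bytes per char: NOT what PAnsiChar code expects
  a: AnsiString;
begin
  s := 'GL_ARB_framebuffer_object';
  // The conversion below goes through the RTL widestring manager; on
  // Windows that ends in OS code page conversion routines, which is
  // (I presume) what the fossilized Wine could not provide.
  a := AnsiString(s);
  WriteLn(Length(s), ' UTF-16 chars -> ', Length(a), ' bytes');
end.
```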

Chebmaster
12-12-2019, 08:50 PM
Cheb's Game Engine is that eternally worked on thing that never releases its next build.

Today, seeing the compilation process complete, I roared like a bear overcoming horrible constipation: I spent a full year refactoring my code. A full year!

Ahhhhh...

A debugging hell lies ahead of me but all I could feel is relief. Finally!

The previous build was released on December 17, 2016 (please don't go looking at it: I am horribly ashamed of that mess which, by some mistake, I call my sources).

Last year I looked at Fidel Castro's example and vowed never to shave or trim my beard until my engine goes past the rotating cube stage. I am now forced to resort to dirty life hacks like putting my t-shirt over my beard so that it stays stuffed down my collar and looks presentable :(...

My track record of making side trips, each holding me back for a year or two:
2007: Linux support
2009: Migrating from OpenGL ~1.4 to OpenGL 2.1 (later GL ES 2)
2010: Tired of never getting anywhere, almost abandoned the project
2013: Changing architecture to multi-threaded for multi-core CPU support
2015: x86-64 support (still not finished, dammit, requires FPC 3.2 released to continue)
2016: Raspberry Pi support
2018: Epic refactoring of my horribly dated ODBMS I created back in 2006

JernejL
13-12-2019, 06:35 AM
Ah yes.. strings! I did the migration of my game to widestrings everywhere 2-3 years ago. I still feel some pain, but it was worth it in the end!

Same with "epic refactoring" - after converting my project from Delphi 7 to Free Pascal, I've done so much refactoring and consequently simplification - making use of operator overloading, generics and methods in records, code writes a hell of a lot cleaner in modern Pascal :)

You have an interesting way of building your engine, especially that you can keep resources loaded and re-load the game logic. I assume it's done via some sort of DLL mechanism?

Chebmaster
13-12-2019, 02:15 PM
Yes, indeed. All game logic resides in a module DLL hosted by the mother EXE. When the DLL unloads, it uses the same serialization mechanism used for saving to selectively gather all asset classes (owning OpenGL handles and such) and store them in a memory stream owned by the EXE.
The process is not as simple as it looks: because of interdependent nested assets like FBOs and their textures, I have the assets loaded from the save "devour" their counterparts received from the mother, taking over their handles. But I had it working almost perfectly before the rehaul.

Generics, it turns out, I invented as well, back when fpc 1.x didn't have dynamic arrays yet but already had modern classes. It worked via tricky includes and the preprocessor. I plan to re-do the static remains of those classes, still used everywhere, using real generics. But that is a secondary task.

What I was going to do before rehauling my ODBMS was replacing the old, horribly awkward module-switching GUI in the EXE with the standard GUI available to the DLL, making them into a specialized module DLL of their own and thus leaving the EXE a dumb container only able to render the console. And because I interrupted that task half-way in October 2018, I forgot what I was going to do and where I stopped. Now my engine finally runs... rendering a blank screen and not responding to inputs.
Where do I start... Boo-hoo-hoo...

Chebmaster
16-12-2019, 03:34 PM
During debugging, I encountered a grievous documentation error:

function TChepersyMemoryManagerChunk.Alloc: pointer;
var
  i, k: integer;
  j: cardinal;
  m: ptruint;
begin
  // the mask bits of non-valid indexes are pre-set to 1, see the constructor
  for i:= 0 to High(f_AllocMask) do begin
    m:= not f_AllocMask[i];
    if m = 0 then continue;
    // https://www.freepascal.org/docs-html/current/rtl/system/bsfdword.html
    // incorrectly states that BsfDWord returns 255 if no bits are set
    // while in fact it returns 0! (at least in fp 2.6.4 it does)
    j:= {$ifdef cpu64}BsfQWord( {$else}BsfDWord( {$endif} m );
    // if j < 255 then begin
    k:= (i * 8 * sizeof(pointer)) + j;
    addlog(' i=%0, j=%1, k=%2, mask=%3', [i, j, k, pointer(f_AllocMask[i])]);
    Assert((k >= f_IdxLow) and (k <= f_IdxHigh)
      , 'TChepersyMemoryManagerChunk.Alloc: index ' + IntToStr(k)
      + ' is out of bounds (' + IntToStr(f_IdxLow) + ','
      + IntToStr(f_IdxHigh) + ')');
    Inc(f_AllocCount);
    Dec(f_FreeCount);
    if f_FreeCount = 0 then CpsMemoryManager.OnChunkBecomingFull(Self);
    f_AllocMask[i]:= f_AllocMask[i] or (ptruint(1) shl j);
    Exit(pointer(ptruint(Self) + ptruint(k) * f_Size));
    // end;
  end;
  Die(MI_ERROR_PROGRAMMER_NO_BAKA, [
    'TChepersyMemoryManagerChunk.Alloc algorithm fail']);
end;

..aaand I'd say this is more than just twitching:
http://chentrah.chebmaster.com/images/cge/shotoftheday_2019_12_16.png
YEEEE-HAW! :D

Chebmaster
16-12-2019, 05:48 PM
elaborating:

program test;
begin
  WriteLn(BsfDword(0));
  WriteLn(BsfQWord(0));
  WriteLn({$I %FPCVERSION%});
end.


d:\chentrah\modules\tests>c:\FPC\2.6.4\bin\i386-win32\fpc bsfdword.pas
Free Pascal Compiler version 2.6.4 [2014/03/06] for i386
Copyright (c) 1993-2014 by Florian Klaempfl and others
Target OS: Win32 for i386
Compiling bsfdword.pas
Linking bsfdword.exe
7 lines compiled, 0.1 sec , 25616 bytes code, 1628 bytes data

d:\chentrah\modules\tests>bsfdword
0
4231860
2.6.4

d:\chentrah\modules\tests>c:\FPC\3.0.4\bin\i386-win32\fpc bsfdword.pas
Free Pascal Compiler version 3.0.4 [2017/10/06] for i386
Copyright (c) 1993-2017 by Florian Klaempfl and others
Target OS: Win32 for i386
Compiling bsfdword.pas
Linking bsfdword.exe
7 lines compiled, 0.1 sec, 25424 bytes code, 1252 bytes data

d:\chentrah\modules\tests>bsfdword
255
255
3.0.4
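Given that the zero-input result differs between releases (0 in 2.6.4, 255 in 3.0.4), the safe pattern is the guard my allocator's `if m = 0 then continue` already implements; as a standalone sketch:

```pascal
// Version-agnostic "find first set bit": test for zero BEFORE calling
// BsfDWord/BsfQWord, so the caller never depends on which value the RTL
// returns for zero input (0 in fpc 2.6.4, 255 in 3.0.4).
function FirstSetBit(m: PtrUInt): integer;
begin
  if m = 0 then Exit(-1); // unambiguous "no bits set" marker
  {$ifdef cpu64}
  Result := BsfQWord(m);
  {$else}
  Result := BsfDWord(m);
  {$endif}
end;
```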

Chebmaster
14-01-2020, 03:22 PM
This mighty Cheb finally wrangled unicode paths into submission!

I've successfully run my engine in GL ES 2 mode from a folder named D:\/人◕‿‿◕人\ while it contains ANGLE DLLs ripped from an old Firefox, which are NOT unicode-aware - i.e. they crash and burn if loaded from a path not representable in the system 8-bit encoding (CP1251 in my case).

This is how:

function GetAnsiSafePath(s: TFileNameString): TFileNameString;
{$ifndef windows}
begin
  Result:= s;
end;
{$else}
// ASSUMING that the file name is safe anyway
var
  u, b: UnicodeString;
  reqlen: dword;
  fn, pt: TFileNameString;
  a: Array of UnicodeString;
  i: integer;
begin
  if Length(s) = 0 then Exit(s);
  fn:= ExtractFileName(s);
  // first, optimize the path, correcting slashes and collapsing all '\..\'
  pt:= OptiPath(ExtractFilePath(s));
  u:= FileNameToUnicode(pt);
  if IsPathAnsiSafe(u) then Exit(s);
  a:= Explode('\', u);
  u:= a[0] + '\'; // assuming it's the drive letter, not checking
  for i:= 1 to High(a) - 1 do begin
    u += a[i] + '\';
    if IsPathAnsiSafe(u) then continue;

    reqlen:= GetShortPathNameW(@u[1], nil, 0); // MSDN says NOT INCLUDING
      // the terminating #0, but then where does that extra space
      // come from?
    if reqlen = 0 then begin
      GetLastError; // clear the error message
      Exit(s);
    end;
    SetLength(b, reqlen); // automatically creates extra space for the terminating zero
    SetLength(b, GetShortPathNameW(@u[1], @b[1], reqlen + 1));
    if Length(b) = 0 then Exit(s);
    u:= b;
  end;
  Result:= UnicodeToFileName(u) + fn;
end;
{$endif}

, where OptiPath is my custom path parser that collapses relative paths containing '\..\'; UnicodeToFileName and friends are from my chtonic patch for fpc where TFileNameString = Utf8String even on Windows (I don't use Lazarus libraries, I implemented everything on my own); and IsPathAnsiSafe is this:

function IsPathAnsiSafe(u: UnicodeString): boolean;
{$ifndef windows}
begin Result:= Yes end;
{$else windows}
var
  a: UnicodeString;
  b: AnsiString;
  i: integer;
  res: longbool;
  ac: AnsiChar;
begin
  if Length(u) < 1 then Exit(Yes);

  res:= false;
  i:= WideCharToMultiByte(CP_ACP, WC_COMPOSITECHECK or WC_DISCARDNS,
    @u[1], length(u), nil, 0, nil, nil);
  if i < 1 then Exit(Yes); // graceful degradation
  SetLength(b, i);
  ac:= #7;
  WideCharToMultiByte(CP_ACP, WC_COMPOSITECHECK or WC_DISCARDNS,
    @u[1], length(u), @b[1], i, @ac, @res);
  Result:= Yes;
  for i:= 1 to Length(b) do
    if ord(b[i]) < 32 then begin
      Result:= No;
      break;
    end;
end;
{$endif windows}
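For what it's worth, the "extra space" puzzle in the comment above has an answer: GetShortPathNameW has two return conventions. With a nil or too-small buffer it returns the required size *including* the terminating #0; on success it returns the number of characters copied *excluding* it. The idiom in isolation (MyGetShortPath is a made-up name, not from my engine):

```pascal
uses Windows;

function MyGetShortPath(const u: UnicodeString): UnicodeString;
var
  reqlen: DWORD;
begin
  Result := u;
  // nil buffer: the return value is the required size INCLUDING the #0
  reqlen := GetShortPathNameW(PWideChar(u), nil, 0);
  if reqlen = 0 then Exit; // keep the long path on failure
  SetLength(Result, reqlen);
  // filled buffer: the return value EXCLUDES the terminating #0,
  // so the second SetLength trims the string to the real length
  SetLength(Result, GetShortPathNameW(PWideChar(u), PWideChar(Result), reqlen));
  if Result = '' then Result := u;
end;
```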

Aaaand, it worked!

Loading d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll...Ok, d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll
Loading d:\84BC~1\3rdparty\ANGLE\win32\libEGL.dll...Ok, d:\84BC~1\3rdparty\ANGLE\win32\libEGL.dll
Loading the procedure addresses from the GL ES DLL ...
glActiveTexture() at 5F1742C0h in d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll
glAttachShader() at 5F1742C5h in d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll
glBindAttribLocation() at 5F1742CAh in d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll
glBindBuffer() at 5F1742CFh in d:\84BC~1\3rdparty\ANGLE\win32\libGLESv2.dll

P.S. I have an unhealthy obsession with redefining true and false as "Yes" and "No". I still follow this in my engine as it has become my standard.

P.P.S. This was a big problem in WinXP as it used the user's name for that user's home directory. Use one non-unicode character and suddenly lots of software crashes on you. Including anything compiled in fpc.
I'm not sure if later windozes resolved that.

Chebmaster
22-03-2020, 09:20 AM
Trying trunk 3.3.1 aka the larva of 3.2.

I was glad to see that exception handling in threads created by a DLL now works out of the box. Interestingly, it seems to have no conflicts with existing Win32 SEH handlers (each driver DLL you load likely installs its own; I tested and confirmed that), yet RTL 3.3.1 doesn't install its own handler.

I wasn't able to figure out *how* the exception handling of 3.3.1 works; it's a black box for me for now.

Test:
program thrtesta;
{$mode objfpc}
{$apptype console}
{$longstrings on}
uses
  {$ifdef unix}
  cthreads,
  {$endif}
  SysUtils,
  Classes
  {$ifdef unix}
  , dl
  {$else}
  , windows
  {$endif}
  ;
type
  TTestThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TTestThread.Execute;
begin
  WriteLn('> A');
  try
    byte(nil^):= 0;
  except
    WriteLn('exe thread ID=', GetCurrentThreadId()
      , ' catch: ', (ExceptObject as Exception).Message);
  end;
  WriteLn('< A');
end;

function PCharToString(P: PAnsiChar): Utf8String;
var
  i: integer;
  p2: PAnsiChar;
begin
  if not Assigned(p) then Result:= ''
  else begin
    p2:= p;
    i:= 0;
    while p2^ <> #0 do begin
      inc(p2);
      inc(i);
    end;
    SetLength(Result, i);
    MOVE(p^, Result[1], i);
  end;
end;

var
  t: TTestThread;
  dllhandle: {$ifdef unix} pointer {$else} THandle {$endif};
  mypath: string;
  {$ifdef unix}
  ufn, upn: Utf8String;
  {$else}
  wfn: UnicodeString;
  wpn: AnsiString;
  {$endif}
  thrproc: procedure; cdecl = nil;

begin
  WriteLn('the exe is built using fpc '
    , {$I %FPCVERSION%}, '/', {$I %FPCTARGETOS%}, '/', {$I %FPCTARGETCPU%});
  WriteLn('main thread ID=', GetCurrentThreadId());
  t := TTestThread.Create(False);
  WriteLn('exe thread created');
  try
    t.WaitFor;
  finally
    t.Free;
  end;
  WriteLn('exe thread terminated');
  WriteLn('loading the DLL...');
  mypath:= ExtractFilePath(ParamStr(0))
    + {$ifdef unix} 'libthrtestb.so' {$else} 'thrtestb.dll' {$endif};
  WriteLn('path is ', mypath);
  {$ifdef unix}
  ufn:= mypath;
  dllhandle:= dlopen(PAnsiChar(ufn), RTLD_NOW);
  if not Assigned(dllhandle) then begin
    WriteLn('failed to load: ', PCharToString(dlerror()));
    Halt(0);
  end;
  upn:= 'thrproc';
  pointer(thrproc):= dlsym(dllhandle, PAnsiChar(upn));
  {$else}
  wfn:= mypath;
  SetLastError(0);
  dllhandle:= LoadLibraryW(PUcs2Char(wfn));
  if dllhandle = 0 then begin
    WriteLn('failed to load.');
    Halt(0);
  end;
  wpn:= 'thrproc';
  pointer(thrproc):= windows.GetProcAddress(dllhandle, PAnsiChar(wpn));
  {$endif}
  if not Assigned(pointer(thrproc)) then begin
    WriteLn('failed to load the procedure.');
    Halt(0);
  end;

  WriteLn('invoking the dll...');
  try
    thrproc;
  except
    WriteLn('exe thread ID=', GetCurrentThreadId()
      , ' catch: ', (ExceptObject as Exception).Message);
  end;
  WriteLn('unloading the dll...');
  {$ifdef unix}
  dlClose(dllhandle);
  {$else}
  FreeLibrary(dllhandle);
  {$endif}
  WriteLn('done.');
end.

library thrtestb;
{$mode objfpc}
{$apptype console}
{$longstrings on}
uses
  {$ifdef unix}
  cthreads,
  {$endif}
  SysUtils,
  Classes;
type
  TTestThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TTestThread.Execute;
begin
  WriteLn('> X');
  try
    WriteLn('> Y');
    try
      WriteLn('> Z');
      try
        byte(nil^):= 0;
      except
        WriteLn('dll thread ID=', GetCurrentThreadId()
          , ' catch in block Z: ', (ExceptObject as Exception).Message);
      end;
      WriteLn('< Z');
    except
      WriteLn('dll thread ID=', GetCurrentThreadId()
        , ' catch in block Y: ', (ExceptObject as Exception).Message);
    end;
    WriteLn('< Y');
  except
    WriteLn('dll thread ID=', GetCurrentThreadId()
      , ' catch in block X: ', (ExceptObject as Exception).Message);
  end;
  WriteLn('< X');
end;

procedure MyMainProc; cdecl;
var
  t: TThread;
begin
  WriteLn('the dll is built using fpc '
    , {$I %FPCVERSION%}, '/', {$I %FPCTARGETOS%}, '/', {$I %FPCTARGETCPU%});
  try
    t := TTestThread.Create(False);
    WriteLn('dll thread created');
    try
      t.WaitFor;
    finally
      t.Free;
    end;
  except
    WriteLn('dll thread ID=', GetCurrentThreadId()
      , ' catch in main proc: ', (ExceptObject as Exception).Message);
  end;
  WriteLn('the dll is done.')
end;

exports
  MyMainProc name 'thrproc';
begin
  // do nothing.
  // The initialization sections DO NOT WORK in Linux for DLLs. And never will.
end.

Still a crash-to-desktop in 3.0.4:
the exe is built using fpc 3.0.4/Win32/i386
main thread ID=6444
exe thread created
> A
exe thread ID=6736 catch: Access violation
< A
exe thread terminated
loading the DLL...
path is d:\chentrah\modules\tests\thrtestb.dll
invoking the dll...
the dll is built using fpc 3.0.4/Win32/i386
dll thread created
> X
> Y
> Z
An unhandled exception occurred at $1000165F:
EAccessViolation: Access violation

Works perfectly in 3.3.1, the 17-year-old bug finally closed:
the exe is built using fpc 2.6.4/Win32/i386
main thread ID=780
exe thread created
> A
exe thread ID=6232 catch: Access violation
< A
exe thread terminated
loading the DLL...
path is d:\chentrah\modules\tests\thrtestb.dll
invoking the dll...
the dll is built using fpc 3.3.1/Win32/i386
dll thread created
> X
> Y
> Z
dll thread ID=6076 catch in block Z: Access violation
< Z
< Y
< X
the dll is done.
unloading the dll...
done.

On the engine front, I am postponing trying 3.3.1 until autumn 2020 (sadly also postponing x86-64 support, which requires it) and concentrating on making my engine usable with 3.0.4.
The reason is a difference in RTTI that necessitates adjustments to my persistency system (and debugging them, dammit!).


It will be ready in a month or two, with a simple asteroids game so that I could finally shave after all those years.

Chebmaster
18-04-2020, 01:28 PM
I give too little time to this project. :(

Made my engine compile and work using 3.2.0-rc1, check. A pain in the butt with strings: you *cannot* use the "+" operator to concatenate them, or they are guaranteed to be wrecked horribly.

Now bringing Win98 support back. Restored a lot of neglected/commented-out code designed to work in 98: it has so many vital WinAPI functions missing! I made a dedicated conditional, CGE_PLATFORM_HAS_WINDOWS98, and wrap any relevant code in it, because all that only works if the mother EXE and the module DLL are compiled using fpc 2.6.4 (lots of WinAPI functions have to be loaded manually; a program built using fpc 3 would crash in Win98 due to unsatisfied dependencies). So I have to support fpc 2.6.4 for all eternity, patching all of its bugs by myself: the lack of Unicode support, the inability to catch AVs in a DLL, incorrectly working functions like BsfDword - I have successfully patched them all in my engine. It took a lot of effort and dedication.

The mother EXE for Win32 will always be compiled using fpc 2.6.4, with optimization set to pentium 3/x87 to keep compatibility with Athlon XP/Pentium III. The module DLL will normally be compiled using fpc 3.2/pentium m/sse2 but there will always be a separate legacy version built using fpc 2.6.4/pentium 3/x87

Scavenged me a Radeon 9800 Pro 128Mb AGP (it has a molex power connector, LOL). Downloaded specific Catalyst for it to work in 98. Now I only have to install it into that machine and actually install Win98 on it.

Examples of dirty tricks Win98 support requires:

function TWinApiFramework.GetScreenRect(fullscreen: boolean): TWindowManagerRect;
var
  rc: TRect;
  Monitor: HMONITOR;
  mi: TMonitorInfo;
begin
  {$ifdef CGE_PLATFORM_HAS_WINDOWS98}
  if (Mother^.State.OS in [ostWin98, OstWin2k])
    and (not Assigned(GetMonitorInfo) or not Assigned(MonitorFromRect))
  then with Result do begin
    // Use the legacy method for windozes older than XP
    left:= 0;
    top := 0;
    width:= GetSystemMetrics(SM_CXSCREEN);
    height:= GetSystemMetrics(SM_CYSCREEN);
    // Assuming that the taskbar is at the bottom and is 28 pixels high:
    if not Fullscreen then Height -= 28;
  end
  else
  {$endif}
  begin
    if windowhandle = 0 then begin
      // Not created yet. Use the default monitor.
      with rc do begin
        left:= 0;
        top:= 0;
        right:= 1;
        bottom:= 1;
      end;
      Monitor:= MonitorFromRect(rc, MONITOR_DEFAULTTOPRIMARY);
    end else begin
      // get the monitor from the window
      GetWindowRect(windowhandle, rc);
      Monitor:= MonitorFromRect(rc, MONITOR_DEFAULTTONEAREST);
    end;
    mi.cbSize:= sizeof(mi);
    GetMonitorInfo(Monitor, @mi);
    if fullscreen
      then rc:= mi.rcMonitor
      else rc:= mi.rcWork;
    Result.left:= rc.left;
    Result.top:= rc.top;
    Result.width:= rc.right - rc.left;
    Result.height:= rc.bottom - rc.top;
  end;
end;
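The Assigned() checks above imply these functions are not imported statically; a sketch of the manual loading that requires (the loader procedure name is made up; only the MonitorFromRect signature is from the WinAPI):

```pascal
uses Windows;

var
  // A procedural variable instead of a static import: a statically
  // imported MonitorFromRect would make the EXE fail to load on Win98
  // entirely. This declaration shadows the unit's static import, if any.
  MonitorFromRect: function(lprc: PRect; dwFlags: DWORD): HMONITOR; stdcall = nil;

procedure LoadOptionalWinApi;
var
  user32: HMODULE;
begin
  user32 := GetModuleHandle('user32.dll');
  if user32 <> 0 then
    pointer(MonitorFromRect) := GetProcAddress(user32, 'MonitorFromRect');
  // Every caller must check Assigned(MonitorFromRect) first and fall back
  // to GetSystemMetrics on pre-XP systems where the lookup returns nil.
end;
```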

de_jean_7777
18-04-2020, 02:14 PM
That's a lot of dedication to support windows 98. I was thinking about it too, but it may make more sense to eventually try to add a win9x platform back into fpc 3.2 rather than use 2.6.4, if at all possible. Since it can work for DOS, I assume it should be able to work for Windows 9x. Since I depend on generics a lot, I can't go back below 3.0.4. But this is not a priority for now, or maybe ever.

Chebmaster
18-04-2020, 02:44 PM
Well, I started developing my engine back in 2003 when I was still using Win98. I kept going at it when I finally moved to Win2k in 2004.


but it may make more sense to eventually try to add a win9x platform back into fpc 3.2
Not a chance.
Why? Because it would require making the Windows unit use dynamic loading, turning a lot of the functions declared there into procedural variables.
The Windows unit in 2.6.4 was effectively *crippled* because of Win98 support in that compiler version. It misses a lot of really useful functions you have to add yourself if you use 2.6.4.

I also believe a lot of string handling routines would have to be adapted to detect Win98 and hack around its limitations... which is not always possible: for example, how do you determine the system locale? I use a dirty hack ("if no Unicode support found, assume Russian/cp1251"), but such things would not fly for the fpc RTL.

There are too few people interested in Win98.

WinXP, on the other hand... But WinXP already has all of the WinAPI functions there and working -- except maybe some exotic ones endemic to Win7/8/10. But these should be exceedingly rare.

Also, there is an abyss of time between WinXP (supported til 2014) and Win98 SE installed from an original CD of 1999 I am aiming to support.

Chebmaster
19-10-2020, 05:29 AM
Postponed again, due to one definitely, decidedly, I swear I'll never do that again, rehaul.

1. Compatibility with fpc 2.6.4 dropped forever (now requires 3.2 on Windows and 3.0.4 on Linux)
2. Support of Windows 98 dropped forever (now requires XP and up).
3. My unique split architecture is limited to developer mode only, which is limited to Win32 only. Now by default the modules are parts of the main executable.
Reason: exception handling in threads created by DLLs still does not work in Linux, requiring an exception hack. Doable (I made one for Win32, after all) but why waste so much effort?
This also saves extra effort on not having to interface a lot of bells-and-whistles things between the mother EXE and the module DLL (which is a royal pain because you have to dumb everything down to a C-like API with pointers instead of arrays and strings). Lots of purely cosmetic things are cut out of the devmode DLL now. The DLL doesn't know anything about the mother's modules list; on clicking module select, it passes control to the hub module built into the EXE. It can't have a custom loading screen background. And so on.

P.S. Just look at that double wrapper which lets the DLL work with TStream instances created by the mother executable. First the stream is cast to ptruint, called a "handle", and a small API is made to access its methods via flat cdecl functions accepting the handle as one of the parameters. Then, on the DLL side, a class is derived from TStream whose methods call that mother API. The result is a perfectly working TStream on both sides, but implementing it was so much pain :(
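The double wrapper might look roughly like this (all identifiers here are made up for illustration; only Read is shown, Write/Seek/etc. follow the same pattern):

```pascal
// Mother EXE side: the stream is handed out as an opaque "handle"
// (really the instance pointer cast to ptruint) plus flat cdecl accessors.
function Mother_StreamRead(handle: ptruint; buf: pointer; count: longint): longint; cdecl;
begin
  Result := TStream(pointer(handle)).Read(buf^, count);
end;

// DLL side: a TStream descendant that forwards every call to the mother
// API, so DLL code gets an ordinary TStream to work with.
type
  TMotherStream = class(TStream)
  private
    fHandle: ptruint;
  public
    constructor Create(aHandle: ptruint);
    function Read(var Buffer; Count: Longint): Longint; override;
    // ... Write, Seek, GetSize wrapped the same way ...
  end;

constructor TMotherStream.Create(aHandle: ptruint);
begin
  inherited Create;
  fHandle := aHandle;
end;

function TMotherStream.Read(var Buffer; Count: Longint): Longint;
begin
  Result := Mother_StreamRead(fHandle, @Buffer, Count); // imported from the EXE
end;
```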

Chebmaster
20-03-2021, 11:59 PM
An embarrassing oopsnik about forgetting edge cases.
I was using per-thread records to store exception data, error messages & backtraces, and self-profiling counters.
Instead of a threadvar, this was organized as a linked list: every time a pointer to the record was requested, the chain was searched for one with the right thread id.
Last Sunday I finally redid it using a threadvar.
I did not change the API: the function is still called and still allocates the record the first time it is asked for; it just uses the threadvar to store and retrieve it. But the linked list is still used, reserved for exactly two cases: the module's thread manager collating error messages from all threads at shutdown, and the histogram renderer, which iterates over all known threads.
Looking at the histograms made me realize the blunder causing uncaught exceptions on exit.
https://chentrah.chebmaster.com/images/tmp/chentrah_pes_oops.png
See that mysterious unnamed thread and the main thread's bar being empty?
That's where I forgot that the record for the main thread must not be allocated via New: the pointer must point to a specific global variable which is the start of the linked list.
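The scheme, including the forgotten edge case, might be sketched like this (all names are made up; the real records hold the error messages, backtraces and profiling counters):

```pascal
type
  PThreadRec = ^TThreadRec;
  TThreadRec = record
    ThreadId: TThreadID;
    Next: PThreadRec; // the linked list survives for whole-list enumeration
    // ... exception data, error messages, self-profiling counters ...
  end;

var
  MainThreadRec: TThreadRec; // list head: a GLOBAL, never allocated via New
  MainThreadId: TThreadID;   // set once via GetCurrentThreadId at startup

threadvar
  MyRec: PThreadRec; // fast per-thread lookup; nil until first requested

function GetThreadRec: PThreadRec;
begin
  if MyRec = nil then begin
    if GetCurrentThreadId = MainThreadId then
      MyRec := @MainThreadRec // the edge case that was forgotten
    else begin
      New(MyRec);
      FillChar(MyRec^, sizeof(TThreadRec), 0);
      MyRec^.ThreadId := GetCurrentThreadId;
      // ... link into the list after MainThreadRec (under a lock) ...
    end;
  end;
  Result := MyRec;
end;
```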

I've also had enough of the horribly convoluted, distributed state machine of the control panel (which I am still trying to wrangle into a separate built-in module) playing tricks and taking me on wild goose chases.
I nuked it and am now re-making it from scratch. As I should have from the start.

Also, I decided to make the control of the verbose log (aka verbal diarrhea mode) granular instead of just "on" and "off". Finding points of interest in kilometer-long walls of text was becoming tiresome. Soon I will be able to switch verbose logging separately for gapi, database, framework, path processing, loading external function pointers and so on.
The debugging executable version will be left for cases of extreme debugging.

Ñuño Martínez
21-03-2021, 10:45 AM
I'm having problems with my engine logs too, so I'll redesign them. I think I'll define an empty "Log" class that just skips logs, and extend it in a class that actually writes the log files. Then you can create as many log objects as you want (maybe just an empty one) and assign them to each engine subsystem. This way you can organize them as you want. For example, if you want no logs, assign the empty one to all; if you want a different log file per subsystem, create as many as you need; for only one big huge file, create one and assign it.

Chebmaster
27-03-2021, 09:44 AM
Then you can create as many log objects you want (maybe just an empty one) then you can assign them to each engine's subsystem.
Nice :)
I couldn't do that, simply because I am restricted to procedural APIs: the devmode module DLL cannot use classes of the mother executable.

For me, there's also a question of performance: any logging has to be wrapped into
if Verbose then AddLog(...)
because sometimes I stick logging into time-sensitive areas.
If I instead do a simple
VerboseLog()
which checks conditions on its own, that's still a lot of passing around strings, "array of const", maybe an IntToHex here and there - and hello, slowdown.

So, since I do that anyway, I just upgraded from
if Mother^.Debug.Verbose
to
if Verbose(<condition>)
, which simply checks an enum against a set.

for example,

if Verbose(vla_Chepersy) then AddLog(
  'Freezing the database state, the memory manager has %0Mb committed'
  + ' in %1 blocks'
  , [Cps.MemoryManager.TotalAllocatedSysMem div (1024 * 1024),
     Cps.MemoryManager.NumBlocks]);
MOVE(chepersy.Cps, SavedChepersyState, sizeof(chepersy.Cps));
FillChar(chepersy.Cps, sizeof(chepersy.Cps), 0); // NO deinit!
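The enum-against-a-set mechanism might look like this (only vla_Chepersy appears in my posts; the other enum members and variable names are made-up examples):

```pascal
type
  TVerboseArea = (vla_Chepersy, vla_Gapi, vla_Database, vla_Framework, vla_Paths);
  TVerboseAreas = set of TVerboseArea;

var
  VerboseAreas: TVerboseAreas = []; // toggled per-area from the console

function Verbose(area: TVerboseArea): boolean; inline;
begin
  // a single bit test against the set: cheap enough to guard AddLog
  // calls even in time-sensitive code
  Result := area in VerboseAreas;
end;
```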


But! More importantly, I finally found a good alternative to the LGPL, which I had used since the very beginning in 2006, and switched my engine to MPL 2.0.
It allows what I am aiming for (static linking with map scripts of questionable reputability, auto-converted from the ACS language to Pascal) while still allowing my code to be used in a GPL project if someone so wishes.
Turns out I had accumulated a lot of third-party code, with a zoo of different licenses; all of that had to be reviewed and addressed.

Chentrah
Copyright (C) 2002-2021 ChebMaster (user5543@chebmaster.com)

The source code is covered by following licenses:
The OpenGL and OpenGL ES 2.0 headers
under SGI Free Software License B

The JEDI DirectSound headers
under dual MPL 1.1 and LGPL

The Broadcom library headers (c) 2012, Broadcom Europe Ltd
under custom free permission
- see un_gles_raspberry_pi.h

The Vampyre Imaging library
(see the 3rdparty folder)
under dual MPL 1.1 and LGPL

The modified part of paszlib (un_unzip.pp)
under custom free permission
- see LICENSE-paszlib.txt

The modified parts of Free Pascal RTL (various parsers of debugging info)
under modified LGPL 2.1 allowing linking into executables regardless of license
- see LICENSE-FPC.txt

The rest of the engine
under MPL 2.0 - which allows use under GPL/LGPL 2.1 or later

If some source file does not contain license information
it means I forgot to embed the MPL2.0 notice.
Poke me with a stick.


Please see how I handled a dwarf parser unit borrowed from the RTL, and tell me I didn't do it wrong:

{
Modified by ChebMaster in 2018 to integrate into Chentrah

This file is part of the Free Pascal run time library.

Copyright (c) 2006 by Thomas Schatzl, member of the FreePascal
Development team
Parts (c) 2000 Peter Vreman (adapted from original dwarfs line
reader)

Dwarf LineInfo Retriever

See the file LICENSE-FPC.txt, included in this distribution,
for details about the copyright.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

**********************************************************************}


This is the file LICENSE-FPC.txt, it applies to modified fragments of
the Free Pascal Run-Time Library (RTL)
distributed by ChebMaster (user5543@chebmaster.com).

The source code of the Free Pascal Runtime Libraries and packages are
distributed under the Library GNU General Public License
(see the file LICENSE-LGPLv2.txt) with the following modification:

As a special exception, the copyright holders of this library give you
permission to link this library with independent modules to produce an
executable, regardless of the license terms of these independent modules,
and to copy and distribute the resulting executable under terms of your choice,
provided that you also meet, for each linked independent module, the terms
and conditions of the license of that module. An independent module is a module
which is not derived from or based on this library. If you modify this
library, you may extend this exception to your version of the library, but you are
not obligated to do so. If you do not wish to do so, delete this exception
statement from your version.

If you didn't receive a copy of the file LICENSE-LGPLv2.txt, contact:
Free Software Foundation
675 Mass Ave
Cambridge, MA 02139
USA

Chebmaster
02-04-2021, 06:12 AM
P.S. More elaboration on dropping Win98 support (after investing so much effort, sniff).

Alas, the said support became an unsustainable effort sink. All for a vanishingly small fraction of hardware that
1. is compatible with Win98
2. has Directx9 - class video card
3. has OpenGL 2.0 drivers for Win98 for said video card

I should have done this much, much earlier. It was OK while my target was OpenGL 1.2, or even when I updated that to 1.4. But when I made my absolute minimum GLES2 (and its desktop substitute GL 2.1), it was time to bury the venerated dead.

The last nail in the coffin was my decision to drop support for video cards without NPOT2 support (I have one, a GeForce FX5200). Incidentally, the FX5200 was the only video card I have that has Win98 OpenGL 2.0 drivers.

As a result I am free to drop support for Free Pascal 2.6.4 (hello, generics, I missed you so much).

Rest assured, I am still supporting WinXP. I cannot find reasons not to.

de_jean_7777
03-04-2021, 12:44 PM
P.S. More elaboration on dropping Win98 support (after investing so much effort, sniff).
Yeah, it's not worth it. WinXP is still supported by FPC, so that works more or less with no effort. My engine will support gl 1.2 as one of the renderers, so there's hope I'll be able to port it for older platforms, but really not a priority.

Ñuño Martínez
06-04-2021, 10:17 AM
P.S. More elaboration on dropping Win98 support (after investing so much effort, sniff).

Alas, said support became an unsustainable effort sink. All for a vanishingly small fraction of hardware that
1. is compatible with Win98
2. has a DirectX9-class video card
3. has OpenGL 2.0 drivers for Win98 for said video card

I should have done this much, much earlier. It was OK while my target was OpenGL 1.2, or even when I updated that to 1.4. But when I made my absolute minimum GLES2 and its desktop substitute GL 2.1, it was time to bury the venerated dead.

The last nail in the coffin was my decision to drop support for video cards without NPOT2 support (I have one, a GeForce FX5200). Incidentally, the FX5200 was the only video card I had with Win98 OpenGL 2.0 drivers.

As a result I am free to drop support for Free Pascal 2.6.4 (hello, generics, I missed you so much).

Rest assured, I am still supporting WinXP. I cannot find reasons not to.
Curious: I'm doing exactly the opposite. Too soon to announce though... ::)

Chebmaster
10-04-2021, 09:03 AM
In the end it all depends on system requirements. When I learned GLSL and realized how much freedom it gives, I was firmly set on using it. I also implemented physics in a separate thread, so multi-core CPUs became a great bonus if not an outright requirement.

This resulted in my target hardware being a Core 2 Duo plus a DirectX9c-class video card. That's around 2007. While Windows 98, with the drivers available for it, barely reaches 2004.

Chebmaster
08-06-2022, 08:22 AM
It seems COVID did not simply damage my brain, making me tire out rapidly. It also dislodged thought processes long ossified in their infinite loops.
I looked back at my code and was horrified. Those paths led me astray, into deep madness and unsustainable effort, already feeding a downward spiral of disheartenment and procrastination.
The best excuse I could come up with was "Tzinch had beguiled me".

So I am now performing a feature cut with all the energy of an industrial blender. Returning the paradigm to the simplicity it had in the early 2010s while salvaging the few really good things I have coded since.

* switching modules in-engine has to go. Every module (game, tool) will have its own mother executable with a release version of the game logic built in.
* only one thread for logic, ever. No "clever hacks" to utilize more cores by loading several copies of the same DLL. My beard is half gray already and I am still nowhere. The crawling featuritis has to stop *now*.
* the reloadable-on-the-fly DLL is for debugging only. No stripping. The complicated mechanism of storing and processing debug info for it has to go.
* the GUI dealing with logic crashes must have no support for the debugging DLL crashing and resolving it. The simple console will do.
* the mother GUI needs no support for DLL loading progress. The simple console will do.
* the debugging DLL uses the mother executable's memory manager, GREATLY simplifying the mother API. No more need for converting arrays to pointers and strings to PChar and back on the other side.
* the debugging DLL is always version-synced with the mother executable. No more need for complicated version checks and compatibility mechanisms.
* all assets are owned by the mother executable (hello, TMap specialization), thus turning the asset-juggling phtagn of old into a complex but manageable mechanism. In the same vein, directories and pk3 files are also owned by the mother executable. Need to refresh assets? Restart the exe.

And a major new feature
* all calls to the GAPI (GLES2 or GL 2.1 with extension) are made via new abstraction layer that borrows heavily from Vulkan.

All in all I hope to have the rotating cube back this autumn :D And then, finally, MOVE FORWARD for the first time since 2014.

Jonax
09-06-2022, 12:45 PM
Sounds like a good plan. Simplify and move forward. :) I look forward to seeing the rotating cube.

Chebmaster
09-06-2022, 02:25 PM
Me too :) Can't wait :(...

I cut one more unnecessary thing. My unconventional developer mode is revolutionary (for 2008, when it was conceived, it would have been world-changing), BUT it is also *horrifyingly* costly in man-hours to get up and running on each platform.

By abandoning it for *all* platforms except win32, I make completing the current refactoring possible during my lifetime.

I spent around a year of my life building foundations for the megahack that worked around exception handling not working in DLLs in FPC 2.6.4 -- but that bug was closed in 3.2, requiring NO such effort. Still no luck on Linux, but then my future Linux and Win64 versions won't have any DLLs at all. Only one release build in one executable using one thread for logic.

To think that I *could* have moved forward with full RPi support as far back as 2016 -- if not for the fact that FPC 2.6.4 for ARM was unable to generate working DLLs, and I stalled waiting for 3.0.x, then procrastinated, slowing down... Had I made this reasonable decision back then, I'd probably have a working game now (even if a simple asteroids test). And I would not have had that close brush with depression, either.

The quote from my favorite writer applies, "a bullet wonderfully clears your brain even when it hits you in the ass" -- but why did I have to catch COVID to realize such simple things?
Trying to learn being more flexible.

On a positive note I finally wrangled the .BAT syntax into submission and redid my entire build.bat for the new paradigm. Short story: use SETLOCAL ENABLEDELAYEDEXPANSION and !MYVAR! instead of %MYVAR% lest woe betide you.

Jonax
11-06-2022, 10:09 AM
I haven't used DLLs in ages. In fact, DLL troubles were one of the reasons motivating me to switch from Visual Basic to Pascal (Delphi) back in the '90s. So far I haven't had to use any in Pascal. But I've only done simple stuff.

Anyway.. getting rid of them DLLs sounds good. And concentrating efforts on fewer and perhaps simpler areas sounds good too. Though the RPi is an interesting and promising platform. And from my point of view Win32 is the platform I use the least nowadays.

Anyway, I look forward to seeing more Cheb stuff in the future. Keep up the good work. I'll try to get my act together and create some new programs too. Though I'm afraid that tends to get delayed.

Chebmaster
19-02-2023, 12:48 PM
I am currently amidst an immense re-haul that changes the very architecture. Hopefully by the end of this year (2023) it will be over and I can move on with creating my first game.

Previously:
My "killer feature", as envisioned back in 2005, was what has since shrunk to "developer mode": all relevant code resides in a DLL that could be re-compiled and re-loaded without re-starting the engine and re-loading assets.

I invested about 4 years total into my database engine 2006-2008 and the asset management (2012-2013) that linked game code with assets stored in the "mother executable".

The common parts of architecture that will remain as is:
- the mother executable has an API: a monolithic record of fields and functions in procedural variables serving as the game code's gateway into the engine. It includes configs, the window manager, and a dual-layer wrapper allowing the DLL to use the mother executable's streams as a TStream of its own.
- the database works on "save a snapshot to TStream" principle, with perfect reproduction of the logic on load.
- assets are identified by a unique hash (was 256 bits, reduced to 128), either randomly generated or an MD5 of the file name.

The old architecture:
the DLL had a logic thread for the object database (single-threaded by design), with assets being classes of it. The DLL managed background tasks; the logic classes had access to the graphical API (OpenGL/GL ES) and had methods for rendering in the main thread. Locking was employed to keep the database from crashing while the render routine was executing in a thread not its own. On unloading, all assets were counted and packed into a separate mother stream, which to the mother executable was a banal TMemoryStream. On loading, the logic had to retrieve that list (which could be empty, if it was the first start, or contain mismatched assets from a different run, in case of switching the session). Each asset object then had to employ a convoluted algorithm of devouring its stored counterpart, absorbing properties and OpenGL handles or discarding them. Which, in the case of hierarchical multi-part assets like FBOs, was turning into a nightmare.

It's no surprise that my development stalled and my phtagn asset manager was plagued by bugs that were very hard to catch (as everything was split into inter-dependent tasks running in background threads).

The new architecture (I'm cutting and cutting it down):
- it's not 2005 anymore; I am developing from an SSD.
- no more "universal" mother that can run any of the games/tools. There is one mother executable per game/tool (one release, one debug with assertions on).
- the DLL is only used in "developer mode", which is only available for x86 Win32. The normal mode of operation, and all other platforms, is logic built into the main executable. No more agony of building DLLs for Linux.
- the DLL runs in the logic thread created by the mother and that's all. The DLL never uses any other threads.
- my new rendering architecture Kakan: the logic fills a command list in its logic thread, abstracted from any APIs, then passes it for execution and forgets it. The rendering in the main thread is done by Kakan. The logic loses all access to OpenGL.
- assets are the mother executable's classes, accessible to the logic as untyped pointers. Any specific details are exposed via pointers to T<XXX>Innards records shared between the mother and the DLL. The mother API's ExposeInnards method returns an untyped pointer. The logic's job is to type-cast it to the correct P<XXX>Innards. Ugly, but I saw no other way to make it simple enough.
- the logic has its own classes for linking to assets, derived from TAbstractAssetLink. All begin with a pair (pointer + hash), where the pointer (to the mother's asset class instance) is never saved with the snapshot and is always nil after de-serialization, and the hash duplicates the mother asset class's hash.
- mother manages assets *and* their lifetime, organized in a specialized fcl-stl map addressed by hashes. Assets are reference-counted, all refcounts are reset to zero after the logic unloads.
- most assets' actualization is handled by the mother in the render phase, employing background tasks if necessary.
- mother owns background threads and can run background tasks, including cpu-side animation.
- the loading screen with its fancy progress indicator was dropped in its entirety. The logic remains frozen until the first successful render but keeps sending render jobs. The render jobs fail on un-actualized assets, causing some assets to actualize each frame, and replace themselves with a console render job. So the "loading screen" is the console with, maybe, a low-res background image.
- the error recovery screens were dropped; the application displays the console with a BSOD background and "Press Esc or Back to exit".
- Kakan manages jobs opaquely to the logic. It sorts jobs by render targets automatically, calculating their order based on where each texture is used as an input and where as a target.
- depth/stencil buffers are managed by Kakan opaquely; targets can only be textures. Reason: targeting Mali 400 as the minimum, where depth/stencil do not actually persist and cannot be reused with another color attachment. Need a depth pass? Stuff its output into an RGBA8. Preferably at 128x72 resolution.
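The automatic target ordering in the last points boils down to a topological sort over texture read/write dependencies. A toy model (Python for brevity; the pass names and the one-target-per-pass simplification are mine, not Kakan's):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Toy model of sorting render jobs by their targets: each pass writes one
# target texture and reads some textures; a pass that reads a texture must
# run after the pass that rendered into it.
passes = {
    "shadow": {"writes": "shadow_map", "reads": set()},
    "scene":  {"writes": "scene_rgb",  "reads": {"shadow_map"}},
    "post":   {"writes": "backbuffer", "reads": {"scene_rgb"}},
}

# For each texture, which pass produces it.
writer_of = {p["writes"]: name for name, p in passes.items()}

# Dependency graph: pass -> set of passes that must run first.
deps = {name: {writer_of[t] for t in p["reads"] if t in writer_of}
        for name, p in passes.items()}

order = list(TopologicalSorter(deps).static_order())
assert order.index("shadow") < order.index("scene") < order.index("post")
```

With the dependencies derived from the jobs themselves, the logic can submit passes in any order and the renderer still executes them correctly, which is exactly what lets the logic "pass the list and forget it".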


The design document for my first planned game has no English translation yet; also, my websites are down due to an unsuccessful hardware upgrade (the venerated SATA controller, dated 2006, finally gave up the ghost: it sees my Samsung HD204UI drives as "ASSMNU GDH02U4 I" and glitches with randomly generated capacities).


P.S. See this nightmare:
SNIPERS: A Nightmare for Developers and Players https://www.youtube.com/watch?v=lOebGm_jMLY
- and that's why my planned game does not have hitscan weapons at all.
"Sniper" will be one of ninja's load-outs, heavily influenced by the TF2 "Lucksman" (sniper's bow that fires arrow projectiles).

P.P.S. When playing competitive first-person shooters, no one wants "serious". What people want is slapstick rumble. So any foolish developers who try a "serious" style soon give up under players' pressure, their artsy black-ops noir degenerating into slapstick comedy. Compare the wisdom of Valve, who made TF2 slapstick from the start (and also reaped immense profit on cosmetics and taunts).
So, the further from a mil-sim, the better. More. More distance. Make spells, not weapons. Use an in-universe reason for player avatars being something like shadow clones, so that they dispel or unravel with zero blood.

P.P.P.S. My solution to the problem highlighted in the video above: make the snooper rifle shoot on release, like bows in Mount & Blade. Like a mini-game. The need to lead your vic-- ahem, target is already there. Combine that with firing in the appropriate time window... Otherwise suffering outrageous penalties to accuracy. So that a zero-time instant shot goes wide most of the time and holding LMB for too long adds increasing sway.
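The release-timing idea could be modeled as a spread curve over hold time (a hedged sketch; all numbers are invented placeholders, not from any design document):

```python
def spread_degrees(hold_time: float) -> float:
    """Toy accuracy model for a shoot-on-release weapon: an instant
    release goes wide, a shot inside the sweet window is accurate,
    and overholding adds growing sway. All constants are made up."""
    SWEET_START, SWEET_END = 0.6, 1.2    # seconds; assumed values
    if hold_time < SWEET_START:          # released too early: wide shot
        return 10.0 * (1.0 - hold_time / SWEET_START) + 0.5
    if hold_time <= SWEET_END:           # sweet spot: accurate
        return 0.5
    return 0.5 + 2.0 * (hold_time - SWEET_END)  # sway grows with overhold

assert spread_degrees(0.0) > spread_degrees(0.8)  # instant shot goes wide
assert spread_degrees(3.0) > spread_degrees(0.8)  # overholding adds sway
```

The shape of the curve is the whole balance lever: widening the sweet window makes the weapon forgiving, steepening the overhold slope punishes camping with a charged shot.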

SilverWarior
20-02-2023, 06:44 PM
P.P.P.S. My solution to the problem highlighted in the video above: make the snooper rifle shoot on release, like bows in Mount & Blade. Like a mini-game. The need to lead your vic-- ahem, target is already there. Combine that with firing in the appropriate time window... Otherwise suffering outrageous penalties to accuracy. So that a zero-time instant shot goes wide most of the time and holding LMB for too long adds increasing sway.

Not a bad video, but it fails to identify where games fail to simulate rifles entirely. The biggest reason why rifles are not good for close-quarters fighting in real life is their long barrel. Why is that?
Well, that barrel has some weight. And since you are holding it away from your body, which acts as the center of rotation, it has quite a lot of inertia, meaning you need to apply quite a significant force to start turning the barrel toward your target, and then an equal force to stop it turning at the right time so you don't turn too much and go past the target. And unlike in a game, where you can quickly move the mouse to turn through a large angle and then almost immediately stop pointing at your target, you would never be able to do so in real life. Not unless you are a super-strong robot.
Bear in mind that it isn't just turning left and right where you are fighting against barrel inertia; it is also up and down. That is why, when you look at some special forces or military personnel, they always walk a bit strangely, keeping their knees bent all the time when their weapon is raised. That is needed because during a normal walk people sway left and right to some degree and bob up and down a bit when going from the left to the right foot and vice versa. So during a normal walk you would be constantly fighting against barrel inertia.
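The inertia argument can be put in numbers: treating the weapon as a uniform rod pivoted at the shoulder, I = mL²/3, so the torque needed for a given turning acceleration grows with the square of the length (a back-of-the-envelope sketch, not a claim about any specific rifle):

```python
def rod_inertia(mass_kg: float, length_m: float) -> float:
    """Moment of inertia of a uniform rod about one end: I = m * L^2 / 3."""
    return mass_kg * length_m ** 2 / 3.0

carbine = rod_inertia(3.0, 0.4)  # short weapon, held at the shoulder
sniper  = rod_inertia(3.0, 1.0)  # the ~1 m "broomstick" from the post

# Same mass, 2.5x the length -> 6.25x the inertia, hence 6.25x the
# torque for the same angular acceleration (torque = I * alpha).
assert abs(sniper / carbine - 6.25) < 1e-9
```

Real weapons are not uniform rods and the pivot is closer to the body, but the quadratic dependence on length is exactly why the long barrel dominates the handling difference.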

Another problem with a long barrel is that moving with it in tight spaces is quite cumbersome. Why? Because you now have a one-meter-long "stick" (the size of my BB rifle from its shoulder support to the end of the barrel) or longer sticking out from you. So you can no longer move as close to a wall without hitting it with the barrel.
Do you want to get a better idea of how cumbersome this can be, but have no actual rifle in your house? Take a broomstick, put the brush against your shoulder as if you were holding a rifle, and go walk around your house. A broom is pretty similar in size to a sniper rifle.
NOTE: I'm not taking responsibility for any damage you might cause in your house during this experiment ;)

So one way of solving the problem of the sniper rifle being so overpowered is making sure that the bigger the movement you have just made, the more time it takes for your aim to become steady, as in real life. And that is going to make a huge difference.

But of course there is another problem. And that is simplified hit detection. In most games you are basically detecting just a body shot (which causes the same damage whether it hits the chest or a finger) and a head shot (an instant kill in most games). But in real life this is much more complex. For instance, if you get shot in a vital organ you are pretty much a goner, even if it is just from a small pistol. But you can actually get shot with a sniper rifle in a non-vital part of your body and live, even if the round went straight through you.
So why do games treat sniper rifles as one-shot kills, then? Because throughout history most snipers were also expert marksmen who knew which body part to hit in order to be effective. So statistically their shot-to-kill ratio was very high. But it wasn't as much due to the sniper rifle as due to their expert marksmanship.

Chebmaster
21-02-2023, 02:55 PM
That broom is quite enlightening.
Yes, using long rifles and swinging zweihanders in tight passages... That's arcade, not sim. Who *ever* implements the inability to turn around because your shillelagh is longer than the corridor is wide...
AFAIR only Tribes: Vengeance even had a mechanic that visibly moved your gun back if you faced a wall (and also called their rocket launcher the "spinfusor", which is seriously badass).

At the very least, firearms could be balanced along a movement-vs-accuracy axis. If you are on the move or change your aim rapidly, you get atrocious random spread (which shotguns and SMGs partly negate by having their own spread). If you want an accurate shot, you switch to an aiming stance, either by stopping and reducing your mouse movements, or by pressing a dedicated button that hampers your movement and zooms.
If a game doesn't have that, it's an arcade and should look long and hard at the Quake series.

Hmm... Maybe I should review my concept. Not forcing a movement penalty while the spell is being charged, but inflicting a large random sway instead (and hiding the crosshair). Then "charging your enemy while charging your shot" becomes a valid strategy. Also, directing homing projectiles while sprinting (the controllable fireball from Dark Messiah of Might and Magic is my shining ideal).

Loadout opportunities arise.
If the spell that serves the role of a shotgun (120% damage total, scattered in a wide cone) could be pre-charged to fire instantly on release while sprinting, and its alt-fire works like the Q3 nailgun (long-range, with very little spatial spread but a large velocity spread), penalized with a loud sound and standing still...
If the spell that mimics Q3 plasma, at the same time, has a sizable firing delay and no way to pre-charge it, because its alt-fire consists of controllable single shots for long-range harassment instead...
That gives depth to the rock-paper-scissors interplay between those two.

Pair with more class-specific spells, like a controllable fireball that has hefty mana cost, and you get seriously fun gameplay with very few actual "weapons".

SilverWarior
21-02-2023, 04:26 PM
AFAIR only Tribes: Vengeance even had a mechanic that visibly moved your gun back if you faced a wall

Actually there are several games that have this mechanic. If my memory serves me correctly, both Crysis and Far Cry 3 have it.

Jonax
21-02-2023, 07:37 PM
Actually, I never play that type of game anymore. The last game where I was running around shooting uglies was the great adventure game 'Legacy' from, I think, 1993. It ran well on a brave 1 MB of RAM, a 386 processor and a VGA monitor. In fact I still have the game in DOSBox. Though I don't think there were any sniper rifles in it.

The point is, as I often say: I'm happy to see activity in the Pascal crowd, but I can't really comment much on the current topic of sniper rifles and their properties.

Chebmaster
22-03-2023, 10:35 AM
Still *deep* in rehauling the very foundations.

Who would have thought that browsing Wikipedia about supercontinent cycles could give you ideas!

My former Logic, bloated to unsustainability and stifled by being the root managed object of the graph, split apart like Pangaea -- and things are becoming so, so much simpler!
Each of the resulting entities is quite manageable; I am in the process of stuffing them full of methods scavenged from my old TAbstractLogic and organizing their interactions.
Also, the root managed object of the graph that goes into a sav is now a transient thing, created just before serialization and disposed of after deserialization. Thus decoupling the save file structure from the actual data structure.

I would never get anywhere with layered lag compensation had I not made this split.

This will also help me nicely to separate the GUI (a local client entity, nonexistent on a dedicated server) from the game world.
I am positive I could present a lag-compensated multi-player rotating cube this autumn.

About first-person shooters: with the exception of occasional delves into Brutal Doom, I prefer team multiplayer games of the run-and-gun variety. Namely, Jagex's Ace of Spades (before it went down) and TF2. Unlike the mindless NPC slaying of single-player shooters, those are tactical struggles against fellow humans, your equals in cunning, working with your team to achieve set goals (usually capturing/holding control points, capturing the flag, or defending against the other team dragging a bomb towards your base).

When I finally release my design document for my planned game, you will see it's basically an AoS clone with ideas borrowed from TF2 and some of my own.
When I initially laid the foundations for my engine, I wanted to make a 4X game -- maybe that, too, in time. Too ambitious, just like me struggling for years trying to one-up Unreal Engine instead of making a game.

TL;DR: snipers are the antithesis of run-and-gun. Like in OpenArena: you are having a fun rocket duel, then comes some killjoy with a railgun. Not on my watch. All my planned weapons are projectile-based.

Chebmaster
24-04-2023, 10:56 AM
Google translate, I call upon you to let me bridge the language gap for free!
(from https://freepascal-ru.translate.goog/forum/viewtopic.php?f=10&t=10058&sid=5ebd00fc2fc9ebbd0d8d73f2588dbe14&start=870&_x_tr_sl=ru&_x_tr_tl=en&_x_tr_hl=ru&_x_tr_pto=wapp&_x_tr_sch=http )

(my reply to discussion about reproducibility and how to achieve it)

Re: Cheb's Game Engine

Message Cheb » 02.03.2023 15:10:10
The trick is to:
a) strictly 32-bit floats.
b) you wrap *any* constant in the code in a typecast to a float. Any. Anytime and anywhere. a := b * Single(2.0); Otherwise, Pascal tries to calculate in as wide a format as possible and does it in a platform-dependent way: doubles, extendeds, black magic...
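Point (b) is easy to demonstrate without Pascal: rounding to 32 bits after every operation gives a different bit pattern than computing in a wider format and rounding once at the end, which is exactly what an unwrapped constant can silently cause. A stdlib-Python illustration of the same pitfall (not engine code):

```python
import struct

def f32(x: float) -> float:
    """Round a Python double to the nearest IEEE 754 single (Single)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

tenth = f32(0.1)  # 0.1 is not exact in binary; this is its Single value

# Accumulate with Single rounding after every add (what strict 32-bit
# arithmetic produces)...
single_sum = 0.0
for _ in range(10):
    single_sum = f32(single_sum + tenth)

# ...versus accumulating in a wider format and rounding once at the end
# (what a compiler may silently do when a constant widens the expression).
double_sum = f32(sum(tenth for _ in range(10)))

assert double_sum == 1.0          # wide accumulation lands exactly on 1.0
assert single_sum != double_sum   # bit patterns diverge -> checksums differ
```

The values are close, but "close" is not "bitwise identical", and an MD5 over the whole range notices a single differing bit.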

Added after 3 hours 54 minutes 43 seconds:
PS. I do not take anything for granted, I experiment; I have a built-in tester in the engine that calculates an MD5 over the entire 32-bit range (4 billion values in total).
Damn, this is when it's inconvenient that the engine doesn't build at all right now.
AFAIR, I compared x86, x86-64 and ARM on a Raspberry -- and everywhere the sine matched bit for bit.

Added after 1 minute 16 seconds:
P.P.S. BUT! Back then I built with 2.6.4 for x86-64 and, AFAIR, 2.6.4 for ARM as well.

Added after 5 hours 37 minutes 26 seconds:
P.P.P.S. I started a separate test program consisting of a single source file ripped from the engine -- but when it will be ready I really dunno; there is no time at all, things piling on from all sides.

Re: Cheb's Game Engine

Message Cheb » 04.03.2023 15:44:36
Oh, how many wonderful discoveries we have! :shock: :x :evil:

(note: if you looked at the GFLOPS indicator of your processor in Intel Burn Test / Lintel and dreamed -- prepare for dashed expectations. On a processor with a limit of 20 gigaflops, a Pascal program will give out around 0.8. Because those are spherical cows coded in the most exalted AVX by special people -- and this is one-at-a-time calculation with guaranteed bitwise reproducibility)

1. Frac() is a monstrously slow function. Lowest of the low, down at the Sin() level. If you were hoping to make an accelerated fake sine like

function ebd_sin(a: float): float; inline;
begin
  a:= frac(a * float(0.318309886183790671537767526745031)); // 1 / 3.141592653589793
  a:= (float(1.0) - a) * a;
  Result:= float(129600.0) * a / (float(40500.0) - a);
end;

- forget it; it will wallow in the same ditch as the sine and they will be oinking head to head (sin() 0.04 gigaflops, ebd_sin() 0.05).
Which is 13 times slower than multiplication and one and a half times slower than 1/sqrt(x).
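For reference, fake sines of this shape descend from Bhaskara I's classical rational approximation of sin(x) on [0, pi]; a sketch of that formula (Python for brevity; the constants in ebd_sin above are the author's and differ from these):

```python
import math

def bhaskara_sin(x: float) -> float:
    """Bhaskara I's 7th-century rational approximation of sin(x) on
    [0, pi]: 16x(pi - x) / (5*pi^2 - 4x(pi - x)). Max error ~0.0016.
    Shown for reference, not as the engine's ebd_sin."""
    a = x * (math.pi - x)
    return 16.0 * a / (5.0 * math.pi ** 2 - 4.0 * a)

# Worst-case error over a fine sampling of [0, pi].
worst = max(abs(bhaskara_sin(i * math.pi / 1000) - math.sin(i * math.pi / 1000))
            for i in range(1001))
assert worst < 0.002                                  # classical error bound
assert abs(bhaskara_sin(math.pi / 2) - 1.0) < 1e-12   # exact at pi/2
```

Only multiplications, one division and no table lookups, which is why such replacements look attractive until, as the post shows, Frac() eats the savings.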

2. In 64-bit code, some things are much slower and some much faster -- but reproducibility is perfect. Checksums always match those from the 32-bit code. To get a mismatch, you need to climb into assembly language and stick your fingers into the electric socket of RSQRTPS (the quick and dirty inverse square root). That one -- yes, that one will have a different checksum on each CPU model, not just per compile target.

AFAIR, on the Cortex A7 the checksums were exactly the same -- although you would not expect it. I can't check right now; all my Raspberries and Oranges are gathering dust on the shelf. And I can't check ARM64 at all: I simply don't have one. I bought an Orange last year -- I even wondered why it was so cheap. It turned out that inside is the same Cortex A7 in an embrace with a Mali 400. That is: the Orange Pi PC is a Chinese analogue of the Raspberry Pi 2B, no higher. And it's still being sold!

Anyway, on x86-64 (compared to x86):
- Frac() got exactly three times faster, making ebd_sin() outperform Sin() by 3.4 times -- because that function slowed down *even more*, to 0.035 gigaflops. Do they hold a special competition or wut?
- multiplication by a constant not wrapped in a typecast to float slowed down 2.78 times compared to a wrapped one. Moreover, the checksums of both options match their counterparts from the 32-bit code (and the two options differ from each other).

More details (including the test source) - when I fix my server and there will be somewhere to post it.

Added after 21 hours 10 minutes 8 seconds:
Furthering the topic of speed: SQRTPS + DIVPS with 1.0s preloaded into the registers is *exactly* four times faster than the standard 1/sqrt(x). Obviously, the compiler uses exactly the same instructions -- only scalar, not vector. Doing four operations at a time accelerates calculations exactly fourfold. I have RCPPS commented out there -- obviously because its checksum did not match: bitwise it came out different from an honest 1/x through DIVPS.

But just look at RSQRTPS go! (four and a half times faster than the reproducible SSE and eighteen times faster than the regular 1/sqrt(x)) -- and it becomes obvious that this is not a bad compiler, this is the processor getting lost in thought when you require bitwise conformance to standards.

..checking 1/sqrt(x)
..................................
..ok, in 45 (pure 21.2) seconds (0.1 GFLOPS)
..md5 checksum = 7BA70F1439D5E2955151CC565477E924

..checking SSE SIMD4 1/sqrt(x)
.................................
..ok, in 29 (pure 5.31) seconds (0.401 GFLOPS)
..md5 checksum = 7BA70F1439D5E2955151CC565477E924

..checking SSE SIMD4 RSQRTPS (packed quick reverse square root)
.................................
..ok, in 25 (pure 1.18) seconds (1.81 GFLOPS)
..md5 checksum = F881C03FB2C6F5BBDFF57AE5532CFFFD


Let me remind you, this is on a CPU for which Lintel reports 20 gigaflops per core (and 30 for two, because both cores do not fit into the TDP at full tilt, making it effectively a 1.5-core CPU).

Added after 3 minutes 45 seconds:

dck_one_div_sqrt: begin
  for m:= 0 to (mm div 8) - 1 do begin
    pointer(pv):= p + m * 8 * sizeof(float);
    pv[0]:= 1/sqrt(pv[0]);
    pv[1]:= 1/sqrt(pv[1]);
    pv[2]:= 1/sqrt(pv[2]);
    pv[3]:= 1/sqrt(pv[3]);
    pv[4]:= 1/sqrt(pv[4]);
    pv[5]:= 1/sqrt(pv[5]);
    pv[6]:= 1/sqrt(pv[6]);
    pv[7]:= 1/sqrt(pv[7]);
  end;
end;
{$if defined(cpu386)}
dck_sse_one_div_sqrt: begin
  for m:= 0 to (mm div 8) - 1 do begin
    pointer(pv):= p + m * 8 * sizeof(float);
    asm
      mov eax, [fourones]
      MOVAPS xmm5, [eax]
      mov eax, [pv]
      MOVAPS xmm6, [eax]
      SQRTPS xmm6, xmm6
      MOVAPS xmm4, xmm5
      DIVPS xmm4, xmm6 //RCPPS xmm6, xmm6 //Reciprocal Parallel Scalars or, simply speaking, 1.0/x
      MOVAPS xmm7, [eax + 16]
      SQRTPS xmm7, xmm7
      MOVAPS [eax], xmm4
      DIVPS xmm5, xmm7 //RCPSS xmm7, xmm7
      MOVAPS [eax + 16], xmm5
    end['eax', 'xmm6', 'xmm7', 'xmm4', 'xmm5'];
  end;
end;
dck_sse_rsqrtps: begin
  for m:= 0 to (mm div 8) - 1 do begin
    pointer(pv):= p + m * 8 * sizeof(float);
    asm
      mov eax, [pv]
      MOVAPS xmm6, [eax]
      RSQRTPS xmm6, xmm6
      MOVAPS xmm7, [eax + 16]
      RSQRTPS xmm7, xmm7
      MOVAPS [eax], xmm6
      MOVAPS [eax + 16], xmm7
    end['eax', 'xmm6', 'xmm7'];
  end;
end;
{$endif}

, where mm in most cases = 2048
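For completeness (not something the test above does): the usual way to tame RSQRTPS-grade error while keeping most of its speed is one Newton-Raphson step, y' = y(1.5 - 0.5*x*y^2), which roughly squares the relative error. A Python sketch of the refinement, with the initial error chosen to match RSQRTPS's documented relative error bound of about 2^-12:

```python
import math

def newton_rsqrt_step(x: float, y: float) -> float:
    """One Newton-Raphson refinement of an approximate 1/sqrt(x):
    y' = y * (1.5 - 0.5 * x * y * y). Roughly doubles the correct bits."""
    return y * (1.5 - 0.5 * x * y * y)

x = 2.0
exact = 1.0 / math.sqrt(x)
y = exact * (1.0 + 3e-4)   # simulate an RSQRTPS-grade seed (~2**-12 relative error)
refined = newton_rsqrt_step(x, y)

assert abs(y - exact) / exact > 1e-4        # seed error is large...
assert abs(refined - exact) / exact < 1e-6  # ...one step squashes it to ~e^2
```

Note this does not restore bitwise reproducibility: the hardware seed from RSQRTPS still differs per CPU model, so the per-model checksum mismatch the post describes would remain.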

Re: Cheb's Game Engine

Message Cheb » 10.03.2023 22:53:15
Updated the requirements, cleaned the definitions in the code of unnecessary variability.

Reason: my minimums include the Athlon 64 X2 (2005; alas, I don't have one) and the Pentium E2140 (2007, the computer named Gray Goose). Both of these dual-core processors are 64-bit (alas, WinXP has no usable 64-bit version) and support SSE3.
Then what the (insert expletive here) was I doing basing my code on SSE2 instead of SSE3?
From now on, any code for x86 and x86-64, in any assembler inserts, assumes SSE3 availability is guaranteed.

I am not going to consider SSE4 and higher, because if the E2140 with its two 1.6 GHz cores has enough horsepower, then any modern CPU would fly into orbit, and there is simply no point in working myself hard over this. My good intentions towards AVX/AVX512 will likely remain intentions.
That's it, all done.

Further, for Linux SBCs my minimum is the Cortex A7. It has VFPv4-16, and I declare the same in my code as the only supported option -- if I ever get to assembler under ARM.
And that's that.

TL;DR: Free Pascal is optimized for *reproducibility* -- bitwise matching results on all platforms. It seems it sacrifices a lot of performance to reach that goal.

Jonax
24-04-2023, 02:43 PM
Thanks for sharing :). Although I didn't grasp the details, despite Google's effort to bridge the gap, I think the conclusion is reasonable.

Chebmaster
24-04-2023, 09:52 PM
Argh. [headdesk] Argh.
Corrected Google's translation by hand. In so many places it raises the question: why even bother using it? It's much better than 10 years ago, but there are still so many things it fails to understand and convey.
Whom am I kidding: correcting is so much easier than translating 100% by myself.

Reproducibility is important for me, since my multiplayer model will be an evolved lockstep:

TLayerRole = (
lro_Bottom, {
In multiplayer, runs at -500ms using perfect inputs finalized by the server.
This is also the only layer that can be serialized. }
lro_DeepUpwell, {
Propagates changes from the bottom to the thermocline, thus lazily correcting for late inputs }
lro_Thermocline, {
Holds steady at -150ms, assuming most inputs arrive *above* it }
lro_FastSurfacing, {
Bubbles the changes from the thermocline to the surface thus doing the bulk of lag compensation }
lro_PresentSurface {
Runs on local player inputs }
);
As always, lots of stuff distracting me (mainly work at work), leaving me no time to move the project further. Frustrating.

SilverWarior
24-04-2023, 10:16 PM
Although I didn't grasp the details, despite Google's effort to bridge the gap

Don't be hard on yourself if you don't understand all the details. Google translate seems to have done quite a good job. But the topic that Chebmaster is talking about is very complex.

He is talking about hardware-level optimization and making use of extended CPU features to accelerate specific processing. This is very complex stuff, especially if you take into account that some of these extended features might be vendor-specific (proprietary to Intel or AMD). This means that if you want to make use of some Intel-proprietary feature on an AMD CPU, or vice versa, the specific feature might not be directly supported by that CPU, so a fallback method, which is usually slower, is used to at least get the desired results. Otherwise such code would simply break.

Another important thing is to make sure that you feed the CPU data in the correct format required by the specific extended feature. Failing to do so could also result in the CPU resorting to some fallback method and thus hurt performance.

Jonax
30-04-2023, 07:34 PM
Don't be hard on yourself if you don't understand all the details. Google translate seems to have done quite a good job. But the topic that Chebmaster is talking about is very complex..


Yeah, my problem is not the quality of the translation, which I can't comment on other than that the sentences seem to have good spelling and structure.


It's good to see some high-tech activity in the Pascal game making field. I, on the other hand, am still trying to familiarize myself with the basics. There are still a lot of unexplored possibilities for me in the world of 2D standard Pascal components. Though I admit the audience potentially interested in my stuff is pretty limited.

SilverWarior
01-05-2023, 12:28 PM
Though I admit the audience potentially interested in my stuff is pretty limited.

Well, the main reason not many people are interested in your games is that you can find similar games all over the internet in web format. So many people may think: why would I go and download his game if I can find the same or a similar game on one of those online-games web pages?

But don't put too much thought into this. We all have to start somewhere. At least you are finishing and publishing some games.
I, on the other hand, have probably been learning game development for far longer (over 15 years now), but since I'm always aiming for too-big ideas I still haven't published any game so far. It is not that I lack ideas or knowledge. I have too many ideas but still not enough knowledge to turn one of my big ideas into reality.

Jonax
01-05-2023, 10:30 PM
Indeed an interesting discussion. It's quite a challenge to reach and please an audience. However I'm afraid we're close to hijacking Cheb's project thread. Sorry Cheb
;D


How about starting a new thread somewhere with some general how-to-become-a-successful-game-creator theme? Maybe the last few posts could be a good starting point.
:)

Chebmaster
05-05-2023, 09:57 AM
I, definitely, want to push Free Pascal to its limits and achieve the impossible.

Here's the determinism check as a standalone project
(note: make sure your browser doesn't correct http into https, since I still haven't fixed my server's Let's Encrypt setup and the https certificate is invalid)
pure source http://chentrah.chebmaster.com/downloads/determchk.zip (7Kb)
with binaries compiled for x86 and x86-64 using both Free Pascal 3.2.2 and Free Pascal 2.6.4 : http://chentrah.chebmaster.com/downloads/determchk_withbinaries.zip (199Kb)

As you can see, the lion's share of processing time goes to calculating those md5 sums.

A reminder: determinism is required for my planned multiplayer code to work at all. If the checksums do not match between platforms, those platforms wouldn't be able to play together and you'd need a separate server for each of them.

My friend, who works in the game industry full time, had to deal with the lack of determinism in Unity: you cannot count on monsters behaving identically when presented with identical player actions. He had to improvise, adding a distributed server of sorts, where each client in a multiplayer game acted as a server for a fraction of the monsters and broadcast the behavior of those monsters to all other clients.

Full determinism, on the other hand, allows sending *only* the player inputs over the network. This is MMO-grade stuff: no matter how many monsters there are (even a million) or how massive the changes to the game world (I want the ability to reduce the whole map to a huge crater), the network traffic remains zilch.
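The determinism check boils down to hashing the serialized world state each tic and comparing digests across builds and platforms. A minimal sketch using Free Pascal's stock md5 unit (the state string and its layout here are invented for illustration, not determchk's actual format):

```pascal
program determsketch;
{$mode objfpc}{$H+}
uses md5;

var
  state: AnsiString;
begin
  // Stand-in for a serialized world state at some tic; the real
  // thing would be the binary-serialized simulation state.
  state := 'tic=100;x=12345;vx=-7';
  // Two platforms are compatible only if this digest matches
  // for every tic of the reference run.
  writeln('state checksum: ', MD5Print(MD5String(state)));
end.
```

If the printed digests diverge at some tic, that tic is where the platforms' math went different ways.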

SilverWarior
05-05-2023, 01:50 PM
Have you perhaps considered using some other hashing algorithm instead of MD5? The CRC32 algorithm is way faster but might produce more collisions, where different inputs yield the same hash. On the other hand, many modern CPUs have hardware support for SHA-based hashing algorithms, which could make those much faster than MD5, which, if my memory serves me correctly, is rarely hardware accelerated.

Anyway, there is a good thread on Stack Overflow comparing various hashing algorithms: https://stackoverflow.com/questions/10070293/choosing-a-hash-function-for-best-performance
Granted, the question poster was interested in performance in the .NET environment, but some of the people who answered have done their own testing in other programming languages, even Delphi.

Chebmaster
08-05-2023, 09:29 AM
I just grabbed the one that was easiest to slap on and had a reasonably sized hash.
This code is not going to be part of normal execution anyway; it will only be used for research during development (or, maybe, as an optional "check your CPU for compatibility" feature).

Chebmaster
19-05-2023, 03:34 PM
I'm more and more tempted by the idea of 16-bit integer physics. 32768 is actually a lot if you use it right. I have experience, after all: that game for MS-DOS used 16-bit physics.
I have also learned a lot since. The problem of velocity discretization at low speeds is easily circumvented by defining speed not per tic but per interval of N tics, so slow objects move in rare jumps, one per hundreds of tics (and are simply interpolated by any object interacting with them).
SSE offers unique possibilities for speeding things up; PMULHW (https://www.felixcloutier.com/x86/pmulhw) is tailor-made for such things, multiplying 8 numbers per clock cycle in the basic version and up to 32 in its AVX-512 incarnation.
Also, sines, cosines and reciprocal square roots could all be implemented as lookup tables with linear interpolation, maybe normalized using BSR, but in any case much faster than their floating-point counterparts.
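For reference, what PMULHW does per 16-bit lane can be modeled as a scalar Free Pascal function: multiply two signed 16-bit values and keep the high 16 bits of the 32-bit product. This is a plain-Pascal sketch of the instruction's per-lane semantics, not the SIMD version itself:

```pascal
{$mode objfpc}
// Scalar model of one PMULHW lane: the 32-bit product of two
// signed 16-bit values, arithmetically shifted right by 16.
function MulHigh16(a, b: SmallInt): SmallInt; inline;
begin
  Result := SmallInt(SarLongint(LongInt(a) * LongInt(b), 16));
end;
// MulHigh16(16384, 16384) = 4096
```

The implicit right-shift by 16 means any fixed-point scheme built on it must account for the scaling, e.g. by pre-scaling constants or adding a compensating shift.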

Jonax
20-05-2023, 06:59 AM
I'm more and more tempted by the idea of 16-bit integer physics. 32768 is actually a lot, if you use it right. [...]

Integer physics, such an interesting idea :). I haven't tried that, but I hear it was common among early programmers. I, on the other hand, rely heavily on the square root for moving things and calculating positions. I haven't considered using a table; it's so easy to just use the square root function, and for my modest needs that's mostly fast enough.

I wasn't even aware of the SSE/AVX thingie. Is it some fancy vector calculation hardcoded in the silicon?
It seems my oldest still-bootable PC (J1900) lacks SSE but the newer machines got SSE(4.2). No mention of AVX.

For moving things slowly I too let them advance at the appropriate intervals. For what it's worth, I once made a game in 16-bit Delphi where I also let moving objects become pale/fuzzy when moving really fast. Works decently in my rather simple 2D games, I think.

Thanks for updating us on your progress, though most of it is beyond me. :)

SilverWarior
20-05-2023, 01:03 PM
I'm more and more tempted by the idea of 16-bit integer physics. 32768 is actually a lot, if you use it right. I have experience, after all - that game for MS-DOS used 16-bit physics.

If you go for integer-based arithmetic, I recommend you stick to 32-bit integers, since most modern CPUs work with either 32-bit or 64-bit registers when doing math. This way you avoid converting from 16 to 32 bits and back all the time. Not to mention that integer overflow flags will work as they should; I am not sure they would work on modern CPUs when using 16-bit integers unless you mess with FPU parameters, which could lead to a host of other problems, since on modern computers no application gets exclusive access to a specific core. Changing FPU parameters might therefore affect other applications.

Anyway, many games actually rely on integer-based physics, some even using 64-bit integers to achieve high enough precision. And there are whole libraries for integer-based math that you can find on the internet.

Another big advantage of integer math is that if you are serializing and deserializing your data to some text-based data structure like XML or JSON, you can be sure that the value stored in the data structure is the same one that was stored in memory.
With floating point this cannot be guaranteed, since you are first converting from floating point to the decimal system, and not every floating-point value can be converted into an exact decimal value, or vice versa.
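The decimal round-trip problem is easy to demonstrate with stock SysUtils formatting routines (the six-digit format is an arbitrary example of what a text-based serializer might use):

```pascal
program floatroundtrip;
{$mode objfpc}{$H+}
uses sysutils;

var
  a, b: Double;
  txt: String;
begin
  a := 1.0 / 3.0;
  // Serialize with the limited precision a text format might use:
  txt := FormatFloat('0.######', a);
  b := StrToFloat(txt);
  // The round trip silently lost the bits beyond six decimals:
  writeln(a = b);  // FALSE
end.
```

Integer fields, by contrast, round-trip exactly through any base-10 text format.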

Chebmaster
21-05-2023, 02:02 PM
If you go for integer-based arithmetic, I recommend you stick to 32-bit integers since most modern CPUs work either with 32-bit or 64-bit registers when doing math. This way you avoid converting from 16 to 32 bit and back all the time[...]

As far as my research shows, modern CPUs are well equipped to operate on 16- and 8-bit integers natively, no need to touch the FPU. Widening and narrowing are also very well supported in hardware (clear a 32-bit register, load a 16-bit value into its lower part, multiply as 32-bit, shift right by 16 bits, store the result from the 16-bit lower part).
x86-64, for example, allows addressing registers like r8w..r15w, meaning the 16-bit parts of its extra 64-bit registers r8..r15 -- not to mention the old trusty ax, bx, cx, dx, si and di of x86 aren't going anywhere.

Serialization is not a problem: I use a library I made back in 2006..2008 that serializes into a binary format (it achieved 1 million instances per second back then, on a Dual Core slowed to 1 GHz by lowering the multiplier in BIOS, with DDR2-400 RAM).
Normal calculations aren't a problem either: Free Pascal is all about reproducibility, as my tests proved. It's vector normalization and trigonometric things like sincos that slow everything down horribly.
Not even physics but animation, which I want to be part of physics, and which operates on lots and lots of bones.
There is also the fact that my skinning is going to be calculated on the CPU only, for ease of coding and better compatibility. 16-bit numbers are twice as fast in memory-bound bottlenecks (which are too easy to hit).


Integer physics, such an interesting idea :). Haven't tried that but I hear it was common for early programmers. I on the other hand rely heavily on the square root for moving things and calculating positions. Haven't considered using a table. It's so easy to just use the square root command. For my modest needs that's mostly fast enough.
I wasn't even aware of the SSE/AVX thingie. Is it some fancy vector calculation hardcoded in the silicon?

Mind the cache: you don't want that table to be larger than one kilobyte or so, which only fits 512 16-bit values -- thus interpolation (and BSR trickery for square roots). But even so, a table would *shred* honest functions speed-wise if you need sin or cos.
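As an illustration of the table approach (the angle convention and fixed-point format here are my own choices, not Chentrah's): a 1 KiB sine table indexed by a 16-bit angle with 65536 units per full turn, linearly interpolated on the low 7 bits, with 16384 representing 1.0:

```pascal
program sinlut;
{$mode objfpc}
const
  N = 512;  // 512 * 2 bytes = 1 KiB, cache-friendly
var
  table: array[0..N] of SmallInt;  // one extra entry eases interpolation

procedure InitTable;
var
  k: Integer;
begin
  for k := 0 to N do
    table[k] := Round(sin(2 * Pi * k / N) * 16384);  // 16384 = 1.0
end;

{ angle: 65536 units per full turn; result: sine, 16384 = 1.0 }
function SinLUT(angle: Word): SmallInt;
var
  idx, frac: Integer;
begin
  idx  := angle shr 7;    // top 9 bits select the table entry
  frac := angle and $7F;  // low 7 bits interpolate between entries
  Result := SmallInt(table[idx] +
    SarLongint((table[idx + 1] - table[idx]) * frac, 7));
end;

begin
  InitTable;
  writeln(SinLUT(16384));  // quarter turn: prints 16384
end.
```

The extra table entry at index N duplicates index 0, so `table[idx + 1]` never reads out of bounds even for the largest angle.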

SSE3 (which I had declared my minimum system requirement) provides sixteen 128-bit registers for vector floating-point and integer calculations.
Its support dates back to single-core ancients: the Athlon 64 (launched in 2003) and the Pentium 4 Prescott (launched in 2004). By 2007, the year I am aiming at hardware-wise, it was old and tried technology.

SSE can also act like MMX on steroids, operating on 16-bit and 8-bit numbers. The PMULHW instruction in particular lets you multiply 16-bit signed integers as if they were 8.8 fixed point: it shifts the 32-bit result right by 16, leaving only the highest 16 bits -- on a vector of eight 16-bit numbers stuffed into a 128-bit register.

AVX is very old stuff as well; my laptop dated 2012 (i5-2540m) has it. It extends the XMM registers to 256-bit YMM registers. Intel provides a code sample, I believe, that allows batch-normalizing vectors at a rate of one vector per clock cycle.
AVX2 only adds more instructions, as far as I know, while AVX-512... Make a guess :p

Free Pascal, as far as I can tell, only supports AVX so far (it has been a long time since I last tested).

But it boggles the mind how much raw muscle even a Core 2 Duo has.

Chebmaster
21-05-2023, 03:32 PM
Correction: it seems I mixed things up. SSE has 8 registers, not 16 -- that is a later extension. 8 is still a lot.

Chebmaster
01-10-2023, 07:05 PM
After a BIG sidetrack into finishing up my favorite author's DooM mod without his permission (beware of a rabid fanboy and all that) -- a work that took literally months, from June to September --
I am beginning to curve back to my own projects.
Remembering what I was doing took some effort, even switching back to Pascal from the horrible twisted hacks of ACS and DECORATE scripting.

I have finished (in theory, mind you: my code still does not compile) several "required secondary powers" without which the basics of the layered architecture were not possible.
Did you know that the <= and >= operators require their own versions when you do operator overloading? Not a surprise, really, after thinking about it.
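Indeed, FPC derives nothing for you here: each comparison operator you need must be overloaded explicitly. A minimal sketch (TFixed is an illustrative type, not one from the engine):

```pascal
{$mode objfpc}
type
  TFixed = record
    raw: SmallInt;  // 8.8 fixed point
  end;

// Defining < and = does NOT give you <= and >= for free;
// each operator needs its own overload.
operator <= (const a, b: TFixed): Boolean;
begin
  Result := a.raw <= b.raw;
end;

operator >= (const a, b: TFixed): Boolean;
begin
  Result := a.raw >= b.raw;
end;
```

In `{$mode objfpc}` the result identifier can be omitted and accessed through the standard `Result` symbol, as above.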

Then, after those parts of the foundation were maybe-ready, it was time to think with pen and paper in hand.
Because my habit of keeping everything in my head would have made it explode in this case.

Now, with this sketch in hand, I can plan finer details and class relationships.
There are several consequences:

1. There should be support for several completely self-contained worlds. In practice, two: the lobby and the map. So that the lobby does not reset on map change, for example. But I can also make each team's dressing room a separate universe -- OR make the lobby itself the dressing rooms, separated by a glass wall for taunting.
2. UI cannot just affect the worlds willy-nilly; it must be linked to an agent -- let's call it the "player character" -- even if that is a dumb spectator camera, or a placeholder waiting for the player to teleport from the lobby to the map. All effects are called "player inputs" and must go through the multiplayer manager, which gathers all inputs from all players to drive a 100% reproducible layered world. All such inputs must be as lightweight as possible, since they are serialized and transmitted over the network.
3. When UI reads the world, it reads the "Present surface" layer, which is the cherry on top of the lag compensation. So UI must be prepared for radical changes to the world and the player character between frames, since full-world lag compensation could introduce or cancel the results of, say, a player nuking half the map.
4. UI cannot reliably link (for monitoring) to any object except the local player's character and objects created in the base layer. Because, due to lag compensation, objects created in intermediate layers are transient and are replaced by "the same" object from the lower layer when that layer bubbles up -- actually a separate object unrelated to the previous one.
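To illustrate point 2 above, a player input that stays lightweight on the wire might look like this packed record (the field layout is my guess for illustration, not Chentrah's actual format):

```pascal
type
  TPlayerInput = packed record
    tic: LongWord;             // simulation tic this input applies to
    buttons: Word;             // bitfield: fire, jump, use, ...
    yaw, pitch: SmallInt;      // view angles, 65536 units per full turn
    move_x, move_y: ShortInt;  // movement intent, -127..127
  end;
```

At 12 bytes per player per tic, even dozens of players cost next to nothing in bandwidth, regardless of how much the world itself changes.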

Jonax
25-01-2024, 10:37 AM
Thinking with pen and paper is a good thing. I try that too often. Though my doodles are not as well organized as yours ;D

Chebmaster
29-01-2024, 03:39 PM
Well, some things are hellishly hard to develop without drawing signal diagrams on millimeter grid paper -- namely, thread-syncing algorithms. Luckily for me, I still have that *huge* roll of it, probably ten meters or so by one meter. It is quite yellowed and prone to cracking, tho... How long ago did I buy that thing? Or was it bought by my *parents* for my school activities in the 1980s...?

I wasn't going to necro this thread for the time being (until I have at least *some* progress to show), but since you did that anyway, I can say this:
I am not going to achieve progress in the near future: work projects at work demand my attention.
All the free time I had the last year went into this: https://www.doomworld.com/forum/topic/137366-chebskies-v104-a-fix-up-mod-for-extermination-day-beta-001/
On the plus side, I've finalized the architecture of my engine, the only thing left is coding and coding and coding it into reality.

I nearly created another Lovecraftian monstrosity in this "carousel of per-tic graveyards". Luckily I ate my vitamins, came to my senses and realized that it is enough to have per-layer graveyards and have each layer use the *upper* layer's graveyard.
The *sole* reason to keep instances alive after the current tic ends is the links to those instances that could lead from the upper layer via accelerated fields (another invention of mine, managed by the memory manager). But when the upper layer floats to the very top and vanishes, those links stop being a concern. So the natural decision was to just use that layer's graveyard. When a layer vanishes, all its instances vanish -- without calling destructors or even cleaning the graveyard, because I upgraded my memory manager to use different memory pools for different layers. Erasing a layer is as simple as dropping the pools associated with it.
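The pool-per-layer idea can be sketched as a trivial bump allocator (everything here -- the names, the fields, the fixed-size block -- is illustrative; the actual Chentrah memory manager is surely more involved):

```pascal
{$mode objfpc}
type
  TLayerPool = record
    base: Pointer;   // one big block holding the whole layer
    used: PtrUInt;
    size: PtrUInt;
  end;

procedure PoolInit(var p: TLayerPool; bytes: PtrUInt);
begin
  GetMem(p.base, bytes);
  p.size := bytes;
  p.used := 0;
end;

function PoolAlloc(var p: TLayerPool; bytes: PtrUInt): Pointer;
begin
  if p.used + bytes > p.size then Exit(nil);  // caller handles growth
  Result := Pointer(PtrUInt(p.base) + p.used);
  Inc(p.used, bytes);
end;

procedure PoolDrop(var p: TLayerPool);
begin
  // The whole layer's instances vanish in one call --
  // no destructors, no per-instance bookkeeping.
  FreeMem(p.base);
  p.base := nil;
  p.used := 0;
  p.size := 0;
end;
```

The point of the design is the last procedure: erasing a layer costs one FreeMem, not a walk over every instance it ever created.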

My second big decision was making the architecture a bit more complex (which will include some painful juggling): having several "worlds" in the base layer, each of which could be at a different tic and could be downloaded from the server independently. My main reason was the dream of having a lobby (with the chat as part of it) that does not disappear on map transition.
It is an *old* annoyance: you want to type something profound like "Lol ez wusses", but the map changes and your words of wisdom are lost.
So if the lobby (think TF2's dressing room) and the actual map are two universes inside the same server, which the client connects to independently, then, since the lobby is featherweight and could be connected to nigh instantly:
1. You start choosing your class and cosmetics, seeing other players and the in-game chat/voice communication right away, before the engine finishes the "download the snapshot + fast-forward the snapshot to the actual tic" combo on the main map.
2. Reconnecting after losing a connection would feel much less frustrating if you could spend most of that time in the lobby, aware of what is happening.