UNICODESTRING vs ANSISTRING



Ñuño Martínez
12-02-2020, 07:50 PM
This is more a request for opinions than an actual question.

Working with Allegro.pas I've found a problem when compiling with Delphi: the default STRING is UNICODESTRING instead of ANSISTRING. Since Allegro is written in C it expects ANSISTRING, so compiling produces a bunch of "implicit cast with potential data loss" warnings, where sometimes "potential" is "actual" and things don't work as expected.

I know I can deal with it using UTF8Encode and UTF8Decode/UTF8ToString to convert strings, but that adds overhead that will make the examples more complex and the executables slower. Free Pascal's default STRING is ANSISTRING, so it doesn't have this problem.
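To make the conversion cost concrete, here is a minimal sketch of the round trip, assuming a compiler that has the UNICODESTRING type (Free Pascal 3.0 or newer, Delphi 2009 or newer). The program and variable names are only for illustration and are not part of Allegro.pas. Newer Delphi versions prefer UTF8ToString over UTF8Decode for the decoding step.

```pascal
program utf8roundtrip;
{$IFDEF FPC}{$MODE OBJFPC}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

var
  lWide, lBack: UnicodeString;
  lUtf8: UTF8String;
begin
  { Build "A" followed by U+00F1 without depending on the source file encoding. }
  lWide := 'A';
  lWide := lWide + WideChar ($00F1);
  { UTF-16 to UTF-8: this is the extra step Delphi needs before calling C code. }
  lUtf8 := UTF8Encode (lWide);
  { UTF-8 back to UTF-16, as needed when the C library returns text. }
  lBack := UTF8Decode (lUtf8);
  WriteLn ('UTF-16 code units: ', Length (lWide));
  WriteLn ('UTF-8 bytes: ', Length (lUtf8));
  WriteLn ('Round trip preserved the text: ', lBack = lWide);
end.
```

Every call that crosses the boundary pays for one such conversion in each direction, which is the overhead mentioned above.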

A workaround would be to OVERLOAD all (or most) Allegro.pas FUNCTIONs and PROCEDUREs that deal with strings. For example, I've tested the following and it works:



UNIT al5nativedlg;

INTERFACE

...

  FUNCTION al_show_native_message_box (
    display: ALLEGRO_DISPLAYptr;
    CONST title, heading, str, buttons: AL_STR;
    flags: AL_INT
  ): AL_INT; INLINE; OVERLOAD;
  FUNCTION al_show_native_message_box (
    display: ALLEGRO_DISPLAYptr;
    CONST title, heading, str, buttons: UNICODESTRING;
    flags: AL_INT
  ): AL_INT; INLINE; OVERLOAD;

...

  FUNCTION _al_show_native_message_box (
    display: ALLEGRO_DISPLAYptr; CONST title, heading, str, buttons: AL_STR;
    flags: AL_INT
  ): AL_INT; CDECL;
    EXTERNAL ALLEGRO_NATIVE_DLG_LIB_NAME NAME 'al_show_native_message_box';

IMPLEMENTATION

  FUNCTION al_show_native_message_box (
    display: ALLEGRO_DISPLAYptr; CONST title, heading, str, buttons: AL_STR;
    flags: AL_INT
  ): AL_INT;
  VAR
    ButtonsPtr: AL_STRptr;
  BEGIN
    IF buttons <> '' THEN
      ButtonsPtr := AL_STRptr (buttons)
    ELSE
      ButtonsPtr := NIL;
    al_show_native_message_box := _al_show_native_message_box (
      display, AL_STRptr (Title), AL_STRptr (Heading), AL_STRptr (Str), ButtonsPtr, flags
    )
  END;

  FUNCTION al_show_native_message_box (
    display: ALLEGRO_DISPLAYptr; CONST title, heading, str, buttons: UNICODESTRING;
    flags: AL_INT
  ): AL_INT;
  BEGIN
    RESULT := al_show_native_message_box (
      display,
      UTF8Encode (title),
      UTF8Encode (heading),
      UTF8Encode (str),
      UTF8Encode (buttons),
      flags
    )
  END;

END.

As I've said, it works even with Free Pascal, but the problem is that a lot of FUNCTIONs and PROCEDUREs would need to be overloaded. Also, Delphi will generate slower executables.

The other option is to add a bunch of compilation directives (such as {$IFDEF DCC}...{$ENDIF}) to the examples where strings are used, but that means that some examples will not be so clean for beginners.

I could also keep the examples as if there were no differences and add a warning to the Delphi documentation about this issue and how to fix it, but again, beginners (and people in general) don't read documentation.

So what do you think is the best option? Do you have another solution?

pitfiend
13-02-2020, 01:07 AM
I think you can set some compiler directives, the ones that control how strings are managed.
$X+ $X- Extended Syntax: this one makes Delphi strings PChar compatible. It also allows you to call functions as procedures, ignoring their results.
$H+ $H- Long Strings: this one turns UnicodeString on/off. It can be used locally to set strings to the old Delphi behavior.
$V+ $V- This one is useful with shortstrings only: when set to $V-, it allows you to pass strings of any size as parameters. If you set it to $V+, you need to pass strict string types.
These are mainly backward compatibility options. Be careful, unexpected results may happen.

Another trick you can use is to define a local string type after detecting whether the compiler is Delphi or Free Pascal, and use it as you need in every parameter you pass.

Chebmaster
13-02-2020, 07:57 AM
FreePascal is also moving toward String = UnicodeString, *but* you have to enable it with {$modeswitch unicodestrings}.
That would break backward compatibility with fpc 2.6.4, though, so you should include a check like

{$if (FPC_FULLVERSION<30000)}
{$fatal Your Free Pascal is too old! Use 3.0 or newer.}
{$endif}

My advice would be to define your own AllegroString and use it everywhere, the definition itself wrapped in conditionals for different compilers/platforms.

I did that with TFileNameString and kept working on my engine long before I made the final decision about which format I wanted for file names. Initially I was planning for TFileNameString to be Utf8String on Linux and UnicodeString on Windows, but I later decided to make it Utf8String everywhere. I only had to correct the type definition and the conversion functions like FileNameToUtf8/FileNameToUnicode and so on, which themselves have several variants wrapped in conditionals.
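A rough sketch of that pattern, following the initial plan described above (UnicodeString on Windows, Utf8String elsewhere). Only the names TFileNameString and FileNameToUtf8 come from the description; the bodies are an assumption about how such a layer could look, not the actual engine code.

```pascal
program filenamestr;
{$IFDEF FPC}{$MODE OBJFPC}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

type
  { The only place that decides the representation. }
{$IFDEF MSWINDOWS}
  TFileNameString = UnicodeString;
{$ELSE}
  TFileNameString = UTF8String;
{$ENDIF}

{ One conversion function, with one variant per representation. }
function FileNameToUtf8 (const aName: TFileNameString): UTF8String;
begin
{$IFDEF MSWINDOWS}
  Result := UTF8Encode (aName);
{$ELSE}
  Result := aName;
{$ENDIF}
end;

var
  lName: TFileNameString;
begin
  lName := 'data.txt';
  WriteLn (Length (FileNameToUtf8 (lName)));
end.
```

Switching to Utf8String everywhere then only means deleting the MSWINDOWS branches; code that uses TFileNameString and FileNameToUtf8 stays untouched.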

P.S. Abstraction layers are good.

de_jean_7777
13-02-2020, 12:16 PM
I also do what Chebmaster does. Define my own StdString type which is currently UnicodeString, and convert in abstract platform routines. In your case your own type would be ansistring due to the Allegro library.

Ñuño Martínez
13-02-2020, 09:32 PM
Thanks for the advice.

Chebmaster wrote:
> My advice would be to define your own AllegroString and use it everywhere, the definition itself wrapped in conditionals for different compilers/platforms.

de_jean_7777 wrote:
> I also do what Chebmaster does. Define my own StdString type which is currently UnicodeString, and convert in abstract platform routines. In your case your own type would be ansistring due to the Allegro library.

Actually, I have already defined two types for Allegro.pas: AL_STR (http://allegro-pas.sourceforge.net/docs/5.2/al5Base.html#AL_STR), which is ANSISTRING, and AL_STRptr (http://allegro-pas.sourceforge.net/docs/5.2/al5Base.html#AL_STRptr), which is PCHAR (or PANSICHAR, depending on the compiler). That solves part of the problem.

It is when using Delphi's RTL that I have problems. For example, to draw the score on screen with al_draw_text (http://allegro-pas.sourceforge.net/docs/5.2/al5font.html#al_draw_text) I may use this:


al_draw_text (aFont, aColor, aXpos, aYpos, 0, Format ('SCORE: %d', [aScore]));


This works perfectly in FPC but shows a warning in Delphi. Note that it actually renders the text (except in a few Allegro functions), but the warning is pretty annoying. I know I can avoid it using conversion functions as I've explained above, but they aren't needed by FPC (actually they won't work!).


pitfiend wrote:
> I think you can set some compiler directives, the ones that control how strings are managed.
> $X+ $X- Extended Syntax: this one makes Delphi strings PChar compatible. It also allows you to call functions as procedures, ignoring their results.
> $H+ $H- Long Strings: this one turns UnicodeString on/off. It can be used locally to set strings to the old Delphi behavior.
> $V+ $V- This one is useful with shortstrings only: when set to $V-, it allows you to pass strings of any size as parameters. If you set it to $V+, you need to pass strict string types.
> These are mainly backward compatibility options. Be careful, unexpected results may happen.
>
> Another trick you can use is to define a local string type after detecting whether the compiler is Delphi or Free Pascal, and use it as you need in every parameter you pass.

I didn't know about the $X and $V directives. Anyway, I did some testing and didn't find that they help.

The test I did was:

procedure TForm1.Button1Click (Sender: TObject);
VAR
  lText: ANSISTRING;
begin
  INC (fNum);
  lText := 'Test #%d';
  lText := Format (lText, [fNum]);
  Memo1.Lines.Add (lText)
end;

Compiled with {$H-}, and also changing the "Long strings by default" to false, but it still shows the warning. :(

I think I should add conditional compilation in the examples (there are only a few that conflict) or write different examples for FPC and Delphi for such cases. ???

de_jean_7777
14-02-2020, 10:16 AM
Ñuño Martínez wrote:
> I know I can avoid it using conversion functions as I've explained above but they aren't needed by FPC (actually they'll not work!).

You can add your own functions that do the conversion when Delphi is used but just return the string as is when FPC is used, via conditional compilation. They do whatever is required internally, and you just use them consistently instead of the Delphi-specific functions. Define them in a shared unit. I do this for different platforms. I don't use Delphi, but I have seen other code which does something similar in order to support both Delphi and FPC.
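Something along these lines, as an untested-on-Delphi sketch. ToAlStr is a hypothetical name, AL_STR is declared locally only to keep the example self-contained, and the Delphi branch uses UTF8Encode the same way the overload example earlier in the thread does.

```pascal
program strconv;
{$IFDEF FPC}{$MODE OBJFPC}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  { Stand-in for the Allegro.pas type, which the thread says is ANSISTRING. }
  AL_STR = AnsiString;

{ Hypothetical helper: turns the RTL's default STRING into AL_STR. }
function ToAlStr (const aText: string): AL_STR;
begin
{$IFDEF FPC}
  { STRING is already ANSISTRING here, so just pass it through. }
  Result := aText;
{$ELSE}
  { Assumes Delphi 2009 or newer, where STRING is UNICODESTRING. }
  Result := UTF8Encode (aText);
{$ENDIF}
end;

begin
  { Examples call the helper and never see the conditional. }
  WriteLn (Length (ToAlStr (Format ('SCORE: %d', [42]))));
end.
```

The conditional lives in one place, so example code stays identical for both compilers.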

Ñuño Martínez
14-02-2020, 05:40 PM
I think you have something there. Thanks. I think I'll do it that way.

casanova
17-02-2020, 10:53 PM
I totally agree, adding your own functions will be a really good idea.


SilverWarior
17-02-2020, 11:54 PM
Ñuño Martínez wrote:
> The test I did was:
>
> procedure TForm1.Button1Click (Sender: TObject);
> VAR
>   lText: ANSISTRING;
> begin
>   INC (fNum);
>   lText := 'Test #%d';
>   lText := Format (lText, [fNum]);
>   Memo1.Lines.Add (lText)
> end;
>
> Compiled with {$H-}, and also changing the "Long strings by default" to false, but it still shows the warning. :(

In modern Delphi versions your example will always show a warning, regardless of which string compiler directives you use in your code. Why? Because the Format function always returns the default Delphi string type, which is UnicodeString, and since you then assign it to an AnsiString you get a warning.

So you either need to create your own version of the Format function that returns an AnsiString result, or disable the use of long strings as the default string type at Project Options->Building->Delphi Compiler->Compiling (check the Syntax options section).
http://docwiki.embarcadero.com/RADStudio/Rio/en/Compiling
Do note that this affects your entire project so if you have other code parts that are built to work with Unicode string they may stop working properly.

Ñuño Martínez
21-02-2020, 10:54 AM
Thanks for your comments.

After some work, I've greatly reduced the warnings when compiling. I'm still testing and making changes, trying to reach zero warnings (if possible).

I've also added my own Format function as SilverWarior suggested, and I think I'll add functions to convert from/to numeric values too.

Once done I'll release a new beta version so you can test it. If you're curious you can see what I'm doing here (https://sourceforge.net/p/allegro-pas/code/HEAD/tree/TRUNK/src/lib/al5strings.pas). :)

Ñuño Martínez
21-02-2020, 11:55 AM
Sorry for the double post, but I have to say this: the solution was trivial :o

While testing I discovered that the function StrPas (https://www.freepascal.org/docs-html/rtl/strings/strpas.html) (used internally to force some calls to overloaded functions in my abstraction layer unit) is in a different unit in modern Delphi (AnsiStrings instead of SysUtils), so I added that unit to be used only in Delphi. After that the warnings appeared again and I was like WTF :o Delphi is trolling me... But using the Find declaration command to see the actual declaration of Format, I discovered that AnsiStrings also defines its own version using ANSISTRING instead of UNICODESTRING!

Note that it doesn't remove the need for the abstraction layer unit, but it greatly simplifies the implementation of this unit (and avoids calling an extra function internally in other units). For example:


USES
{$IFDEF ISDELPHI2009ANDUP}
{ This unit implements sysutils using ANSISTRING instead of UNICODESTRING,
  which is the default in modern Delphi compilers. }
  AnsiStrings;
{$ELSE}
  sysutils;
{$ENDIF}

···

FUNCTION al_str_format (CONST Fmt: AL_STR; CONST Args : ARRAY OF CONST)
: AL_STR;
BEGIN
{ Assign the result; calling Format as a plain statement would discard it. }
  al_str_format := Format (Fmt, Args)
END;

This way there's no need of conditional compilation outside Allegro.pas.

So, now we know. :)

SilverWarior
21-02-2020, 01:52 PM
Wait, Delphi has a special unit just for dealing with ANSI strings? I must shamefully admit that I didn't know that :-[

Ñuño Martínez
26-02-2020, 11:46 AM
Well, you know now.

Anyway, I think they should add an option to tell the compiler to define STRING as ANSISTRING, as FPC does with {$unicodestrings} or something. That would help a lot when porting and maintaining code.

pitfiend
02-03-2020, 04:16 PM
Mother of Strings!!! It's highly inefficient and error prone to have multiple, obscure and undocumented units to deal with the same task.

Ñuño Martínez
04-03-2020, 09:28 PM
If you mean the AnsiStrings unit, I agree. If I were a Delphi developer I would overload the functions, adding UNICODESTRING versions, without breaking backwards compatibility.

If you mean my al5strings... well, I haven't found a better solution.

pitfiend
11-03-2020, 03:24 AM
I meant AnsiStrings unit. That's a mistake. Your al5strings unit is a fix on a mistake made by others.