FillChar and StringOfChar under Delphi 10.2 for the Win64 Release target

I have questions about a specific programming issue in the Delphi 10.2 Pascal programming language.

StringOfChar and FillChar do not work properly under Win64 Release versions on CPUs released before 2012.

The expected result of FillChar is simply that the given memory buffer is filled with repetitions of a single 8-bit value.

The expected result of StringOfChar is the same, except that the result is returned as a string.
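
As a minimal sketch of that expected behavior (the program name, buffer size, and fill character are illustrative and not taken from the original report), a small console program could look like this:

```pascal
program FillSemantics;

{$APPTYPE CONSOLE}

var
  Buf: array[0..3] of AnsiChar;
  S: AnsiString;
begin
  // FillChar writes the same 8-bit value into every byte of the buffer.
  FillChar(Buf, SizeOf(Buf), 'x');

  // StringOfChar builds a string consisting of Count copies of one character.
  S := StringOfChar(AnsiChar('x'), 5);

  // Print the first and last buffer elements, then the string.
  Writeln(Buf[0], Buf[3], ' ', S);
end.
```

The lengths here are deliberately short, so this only illustrates the intended semantics; it is not expected to trigger the problem described below.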

In practice, however, when I recompiled with Delphi 10.2 an application that had worked when built with earlier Delphi versions, the Win64 build stopped working properly on CPUs released before 2012.

StringOfChar and FillChar do not work correctly – they return a string of different characters, albeit a repeating pattern – not just a sequence of the same characters as they should.

This is the smallest code that is enough to demonstrate the problem. Please note that the length of the sequence should be at least 16 characters, and the characters should not be nul (#0). The code is as follows:

procedure TestStringOfChar;
 var
   a: AnsiString;
   ac: AnsiChar;
 begin
   ac := #1;
   a := StringOfChar(ac, 43);
   // Compare against a literal of exactly 43 #1 characters (4 x 10 + 3).
   if a <> #1#1#1#1#1#1#1#1#1#1 + #1#1#1#1#1#1#1#1#1#1 +
           #1#1#1#1#1#1#1#1#1#1 + #1#1#1#1#1#1#1#1#1#1 + #1#1#1 then
   begin
     raise Exception.Create('ANSI StringOfChar Failed!!');
   end;
 end;
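
The test above compares the result against a 43-character literal. A loop-based check may be easier to adapt to other lengths and can also report where the corruption starts. The following is only a sketch: `FirstMismatch` is a hypothetical helper name, not something from the RTL or from the original post.

```pascal
program FindFillMismatch;

{$APPTYPE CONSOLE}

// Hypothetical helper: returns the 1-based index of the first character in S
// that differs from Expected, or 0 if every character matches.
function FirstMismatch(const S: AnsiString; Expected: AnsiChar): Integer;
var
  I: Integer;
begin
  for I := 1 to Length(S) do
    if S[I] <> Expected then
      Exit(I);
  Result := 0;
end;

var
  A: AnsiString;
  Bad: Integer;
begin
  A := StringOfChar(AnsiChar(#1), 43);
  Bad := FirstMismatch(A, #1);
  if Bad = 0 then
    Writeln('all ', Length(A), ' characters match')
  else
    Writeln('first mismatch at index ', Bad, ': #', Ord(A[Bad]));
end.
```

On an affected build, the reported index and character code should make the repeating pattern mentioned earlier easier to see.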
 

I know there are a lot of Delphi programmers on StackOverflow. Have you encountered the same problem? If yes, how do you solve it? What’s the solution? BTW, I’ve contacted the developers of Delphi but so far they have neither confirmed nor denied the issue. I’m using Embarcadero Delphi 10.2 version 25.0.26309.314.

Update:

If your CPU was manufactured in 2012 or later, also add the following lines before calling StringOfChar to reproduce the issue:

const
   ERMSBBit = 1 shl 9; //$0200
 begin
   CPUIDTable[7].EBX := CPUIDTable[7].EBX and not ERMSBBit;
 

As for the April 2017 hotfix for the RAD Studio 10.2 toolchain issues: I tried both with and without it, and it did not help. The problem persists regardless of the hotfix.

1> Johan – rein..:


StringOfChar(A: AnsiChar, count) uses FillChar under the hood.

You can use the following code to resolve this issue:

(***********************************************************
  System.FastSystem
  A fast drop-in addition to speed up functions in System.pas
  It should compile and run in XE2 and beyond.
  Alpha version 0.5, fully tested in Win64
  (c) Copyright 2016 J. Bontes
    This Source Code Form is subject to the terms of the
    Mozilla Public License, v. 2.0.
    If a copy of the MPL was not distributed with this file,
    You can obtain one at http://mozilla.org/MPL/2.0/.
 ***********************************************************
 FillChar code is an altered version of FillCharsse2 from SynCommons.pas
 which is part of Synopse framework by Arnaud Bouchez
 ***********************************************************
 Changelog
 0.5 Initial version:
 ***********************************************************)

 unit FastSystem;

 interface

 procedure FillChar(var Dest; Count: NativeInt; Value: ansichar); inline; overload;
 procedure FillChar(var Dest; Count: NativeInt; Value: Byte); overload;
 procedure FillMemory(Destination: Pointer; Length: NativeUInt; Fill: Byte); inline;
 {$EXTERNALSYM FillMemory}
 procedure ZeroMemory(Destination: Pointer; Length: NativeUInt); inline;
 {$EXTERNALSYM ZeroMemory}

 implementation

 procedure FillChar(var Dest; Count: NativeInt; Value: ansichar); inline; overload;
 begin
   FillChar(Dest, Count, byte(Value));
 end;

 procedure FillMemory(Destination: Pointer; Length: NativeUInt; Fill: Byte);
 begin
   FillChar(Destination^, Length, Fill);
 end;

 procedure ZeroMemory(Destination: Pointer; Length: NativeUInt); inline;
 begin
   FillChar(Destination^, Length, 0);
 end;

 //This code is 3x faster than System.FillChar on x64.

 {$ifdef CPUX64}
 procedure FillChar(var Dest; Count: NativeInt; Value: Byte);
 //rcx = dest
 //rdx=count
 //r8b=value
 asm
               .noframe
               .align 16
               movzx r8,r8b //There's no need to optimize for count  512kb
               jnz @InitFillHuge
 @Doloop64: add rcx,r8
               dec edx
               mov [rcx-64+00H],rax
               mov [rcx-64+08H],rax
               mov [rcx-64+10H],rax
               mov [rcx-64+18H],rax
               mov [rcx-64+20H],rax
               mov [rcx-64+28H],rax
               mov [rcx-64+30H],rax
               mov [rcx-64+38H],rax
               jnz @DoLoop64
 @done: rep ret
               //db $66,$66,$0f,$1f,$44,$00,$00 //nop7
 @partial: mov [rcx-64+08H],rax
               mov [rcx-64+10H],rax
               mov [rcx-64+18H],rax
               mov [rcx-64+20H],rax
               mov [rcx-64+28H],rax
               mov [rcx-64+30H],rax
               mov [rcx-64+38H],rax
               jge @Initloop64 //are we done with all loops?
               rep ret
               db $0F,$1F,$40,$00
 @InitFillHuge:
 @FillHuge: add rcx,r8
               dec rdx
               db $48,$0F,$C3,$41,$C0 // movnti [rcx-64+00H],rax
               db $48,$0F,$C3,$41,$C8 // movnti [rcx-64+08H],rax
               db $48,$0F,$C3,$41,$D0 // movnti [rcx-64+10H],rax
               db $48,$0F,$C3,$41,$D8 // movnti [rcx-64+18H],rax
               db $48,$0F,$C3,$41,$E0 // movnti [rcx-64+20H],rax
               db $48,$0F,$C3,$41,$E8 // movnti [rcx-64+28H],rax
               db $48,$0F,$C3,$41,$F0 // movnti [rcx-64+30H],rax
               db $48,$0F,$C3,$41,$F8 // movnti [rcx-64+38H],rax
               jnz @FillHuge
 @donefillhuge:mfence
               rep ret
               db $0F,$1F,$44,$00,$00 //db $0F,$1F,$40,$00
 @Below32: and r9d,not(3)
               jz @SizeIs3
 @FillTail: sub edx,4
               lea r10,[rip + @SmallFill + (15*4)]
               sub r10,r9
               jmp r10
 @SmallFill: rep mov [rcx+56], eax
               rep mov [rcx+52], eax
               rep mov [rcx+48], eax
               rep mov [rcx+44], eax
               rep mov [rcx+40], eax
               rep mov [rcx+36], eax
               rep mov [rcx+32], eax
               rep mov [rcx+28], eax
               rep mov [rcx+24], eax
               rep mov [rcx+20], eax
               rep mov [rcx+16], eax
               rep mov [rcx+12], eax
               rep mov [rcx+08], eax
               rep mov [rcx+04], eax
               mov [rcx],eax
 @Fallthough: mov [rcx+rdx],eax //unaligned write to fix up tail
               rep ret

 @SizeIs3: shl edx,2 //r9 <= 3 r9*4
               lea r10,[rip + @do3 + (4*3)]
               sub r10,rdx
               jmp r10
 @do3: rep mov [rcx+2],al
 @do2: mov [rcx],ax
               ret
 @do1: mov [rcx],al
               rep ret
 @do0: rep ret
 end;
 {$endif}
 

The easiest way to solve the problem is to download mORMot and include SynCommons.pas in your project. This patches System.FillChar with the code above and brings in some other performance improvements.

Please note that you don't need all of mORMot, only SynCommons.
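
If adding assembler or a third-party unit is not an option, a slower pure-Pascal fallback is another possibility. The sketch below is not part of the answer above: `SlowFill` and `SlowStringOfChar` are hypothetical names, and it assumes that a plain byte loop avoids the routine in System.pas that misbehaves.

```pascal
program SlowFillFallback;

{$APPTYPE CONSOLE}

// Hypothetical fallback: fills Count bytes at Dest with Value using a plain
// byte loop instead of calling System.FillChar.
procedure SlowFill(var Dest; Count: NativeInt; Value: Byte);
var
  P: PByte;
begin
  P := @Dest;
  while Count > 0 do
  begin
    P^ := Value;
    Inc(P);
    Dec(Count);
  end;
end;

// Hypothetical stand-in for StringOfChar(AnsiChar, Count), built on SlowFill.
function SlowStringOfChar(Ch: AnsiChar; Count: Integer): AnsiString;
begin
  if Count <= 0 then
    Exit('');
  SetLength(Result, Count);
  SlowFill(Result[1], Count, Byte(Ch));
end;

begin
  Writeln(Length(SlowStringOfChar('*', 43)));
  Writeln(SlowStringOfChar('*', 5));
end.
```

This trades speed for simplicity, so it is probably only worth considering for small buffers or as a temporary measure.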


Okay, I checked. This is faster than the new System.FillChar. About 2-3 times faster.


Thanks again, Johan, for contributing your x64 asm code to SynCommons.pas!

This article is from the internet and does not represent the position of 1024programmer. Please indicate the source when reprinting: https://www.1024programmer.com/764991

author: admin
