3

It's easy to define a string with a size of 3 (in old Delphi code):

st:string[3];

Now we wish to move the code to ANSI:

st:ansiString[3];

This won't work!

And for the advanced OEM type:

st:oemString[3]; 

The same problem occurs, where

type
  OemString = Type AnsiString(CP_OEMCP);

How can a fixed-length ANSI string, and a fixed-length string of the new OEM type, be declared?

Update: I know it will create a fixed-length string. That is part of the design of the software to protect against mistakes, and it is essential for the program.

none
  • 3
    Why do you think you need `AnsiString[3]`? – Cosmin Prund May 30 '11 at 13:36
  • 1
    then shalt thou insert 3 chars, no more, no less. Three shall be the number thou shalt insert chars , and the number of the chars shall be three. Four shalt thou not insert chars, neither insert chars thou two, excepting that thou then proceed to insert chars three. Five is right out. Once three chars were insert, being the third number, be reached, then lobbest thou ..., who being naughty in My sight, shall snuff it – none May 31 '11 at 13:04
  • I think Cosmin meant the "Ansi" part, not the 3. – NGLN May 31 '11 at 15:58
  • @NGLN Well, sometimes in life there are other companies that will not push forward in technology, and you still want them to pay you for a working product. So you keep working with their old other software, and interface with it through AnsiString. Not my choice. – none Jun 01 '11 at 09:16

5 Answers

5

You don't need to define the size of an AnsiString.

The notation

string[3] 

is for short strings used by Pascal (and Delphi 1) and it is mostly kept for legacy purposes.

Short strings can be declared with a maximum length of 1 to 255 characters. The first ("hidden") byte contains the current length.

AnsiString is a pointer to a character buffer (0 terminated). It has some internal magic like reference counting. And you can safely add characters to an existing string because the compiler will handle all the nasty details for you.

UnicodeStrings are like AnsiStrings, but with Unicode chars (2 bytes each in this case). The default string type now (since Delphi 2009) maps to UnicodeString.

The type AnsiString has a construct for attaching a code page (used to define the characters above 127), hence the CP_OEMCP:

OemString = Type AnsiString(CP_OEMCP);
Toon Krijthe
  • 1
    @none, you will introduce nasty buffer overflows if you try to recreate a pchar using a fixed length ansistring or unicodestring, don't do it. Let Delphi/c++builder worry about the length of the string. It handles all that stuff automatically and in 20 years of programming I've never felt the need to override that behaviour. – Johan May 30 '11 at 13:35
4

"Short strings" are "Ansi" strings, because they are only available for backward compatibility with pre-Delphi code.

       st: string[3];

will always create a fixed-length "short string" using the current Ansi code page / char set, including in Delphi 2009 and later.

But such short strings are NOT the same as the so-called AnsiString. There is no code page for short strings, just as there is no reference count for short strings.

The code page exists only for the AnsiString type, which is not fixed-length but variable-length and reference counted, so it is a completely different type from a short string defined by string[...].

You can't just mix short string and AnsiString type declarations, by design. Both are called 'strings' but they are different types.

Here is the memory mapping of a short string:

  st[0] = length(st)
  st[1] = 1st char (if any) in st
  st[2] = 2nd char (if any) in st
  st[3] = 3rd (if any) in st
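The short string mapping above can be checked with a small console program. This is only a sketch: the alias `TCode3` and the variable names are made up for illustration, and it assumes a Delphi compiler where `string[3]` is available (desktop targets).

```pascal
program ShortStringLayout;
{$APPTYPE CONSOLE}

type
  TCode3 = string[3];  // hypothetical alias: short string holding at most 3 chars

var
  st: TCode3;
  src: AnsiString;

begin
  st := 'AB';
  Writeln(SizeOf(st));   // 1 hidden length byte + 3 character bytes
  Writeln(Ord(st[0]));   // current length, read from the hidden byte
  src := 'ABCDE';
  st := src;             // assigning a longer value truncates to 3 chars
  Writeln(Length(st));
  Writeln(st);
end.
```

The storage is always the declared size plus one byte, whatever the current length is, which is why such fields are popular inside fixed-size records.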

Here is the memory mapping of an AnsiString or UnicodeString type:

  st = nil   if st=''
  st = PAnsiChar if st<>''

and here is the PSt: PAnsiChar layout:

  PWord(PSt-12)^ = code page
  PWord(PSt-10)^ = character size in bytes (1 for AnsiString, 2 for UnicodeString)
  PInteger(PSt-8)^  = reference count
  PInteger(PSt-4)^  = length(st) in AnsiChar or UnicodeChar count
  PAnsiChar(PSt) / PWideChar(PSt) = Ansi or Unicode text stored in st, finished by a #0 char (AnsiChar or UnicodeChar)

So while there are some similarities between the AnsiString and UnicodeString types, the short string type is totally different, and can't be mixed with them as you wished.
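Rather than reading the header through pointer arithmetic, the same fields can be inspected with routines from the `System` unit (`StringCodePage`, `StringElementSize`, `StringRefCount`, available since Delphi 2009). The sketch below assumes such a compiler; the variable names are illustrative, and the code page printed depends on the machine.

```pascal
program LongStringHeader;
{$APPTYPE CONSOLE}

var
  a, b: AnsiString;
  u: UnicodeString;

begin
  a := 'abc';
  UniqueString(a);                // force a heap copy so the reference count is meaningful
  u := 'abc';

  Writeln(SizeOf(a) = SizeOf(Pointer)); // the variable itself is only a pointer
  Writeln(StringElementSize(a));  // bytes per character in an AnsiString
  Writeln(StringElementSize(u));  // bytes per character in a UnicodeString
  Writeln(StringCodePage(a));     // code page stored in the header (system dependent)
  Writeln(Length(a));             // length field, counted in characters

  Writeln(StringRefCount(a));     // count before sharing
  b := a;                         // no copy is made, both variables share one buffer
  Writeln(StringRefCount(a));     // count after sharing
end.
```

None of these fields exist for a `string[3]`, which is the reason the two declarations cannot be combined.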

Arnaud Bouchez
  • @user What does "null terminator is not necessary" mean for you? It is always there, even if you don't need it. But it's necessary to call Windows API e.g. with a simple `PChar(aString)` expression. – Arnaud Bouchez May 30 '11 at 14:52
  • Not so simple. It is a typecast, but it emits an LStrToPChar call, and the #0 could be inserted at that point. I'm not so sure about "always"; got any whitepapers to read? – Premature Optimization May 30 '11 at 15:08
  • 1
    @user Just use `pointer(aString)`: it returns `nil` when `aString=''` and otherwise a `PChar` pointing to the text of `aString`, with no `LStrToPChar / UStrToPChar` call. That's why I always use `pointer(aString)` in my libraries to avoid this hidden call. And I'm *SURE* about that, from the RTL intrinsics (I [rewrote some part](http://synopse.info/forum/viewforum.php?id=6) of it for speed optimization, so I know a little about it). Take a look at the `System.pas` unit. – Arnaud Bouchez May 30 '11 at 16:22
  • 1
    Delphi strings are indeed null terminated. This can be checked easily by setting MyVar := 'Hello' and calling SetLength(MyVar, 4). If you inspect MyVar, you'll see the "o" has been replaced by a #0 (even if MyVar was reallocated in place). – Ken Bourassa May 30 '11 at 16:52
  • @Ken Bourassa, ok i figured out details of lstring management and removed that comment. – Premature Optimization May 30 '11 at 19:29
  • @Downvoter Take a look at this page: http://docwiki.embarcadero.com/RADStudio/en/Internal_Data_Formats#Long_String_Types or "use the source, Luke" - i.e. the System.pas unit. – Arnaud Bouchez Jul 29 '11 at 11:50
2

That would only be useful if String[3] in Unicode versions of Delphi defaulted to 3 WideChars. That would surprise me, but in case it does, use:

st: array[1..3] of AnsiChar;
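A fixed array has no length byte, so shorter values need some padding convention. The sketch below assumes space padding; `TAnsi3`, `ToAnsi3` and `FromAnsi3` are hypothetical names invented for this example, not RTL routines.

```pascal
program FixedAnsiChars;
{$APPTYPE CONSOLE}

type
  TAnsi3 = array[1..3] of AnsiChar;  // hypothetical fixed 3-byte field

// Copies up to 3 characters of S into the field, padding with spaces
// (the padding character is an assumption of this sketch).
function ToAnsi3(const S: AnsiString): TAnsi3;
var
  i: Integer;
begin
  for i := 1 to 3 do
    if i <= Length(S) then
      Result[i] := S[i]
    else
      Result[i] := ' ';
end;

// Builds an AnsiString from all 3 bytes of the field.
function FromAnsi3(const A: TAnsi3): AnsiString;
begin
  SetString(Result, PAnsiChar(@A[1]), Length(A));
end;

var
  field: TAnsi3;

begin
  field := ToAnsi3('AB');
  Writeln(SizeOf(field));            // exactly 3 bytes, no hidden length byte
  Writeln('[', FromAnsi3(field), ']');
end.
```

Unlike `string[3]`, the array occupies exactly 3 bytes and always carries 3 characters, so the "current length" has to be encoded by the padding.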
NGLN
  • 2
    No need for surprises here, `String[3]` is the Pascal short string on all (known) Delphi Unicode versions. Changing that to WideChars would break a lot of code that specifically used it to store strings into fixed size records. – Cosmin Prund May 30 '11 at 13:34
  • This is silly, if you **have** to do that, just use a shortstring. – Johan May 30 '11 at 13:42
  • @Cosmin Prund: that sounds logical. – NGLN May 30 '11 at 13:57
  • Thanks for pointing out how to convert from string[3] to array[1..3] of AnsiChar. I was going to make an "off-by-one"-error – Sebastian Aug 06 '14 at 10:03
1

The size of an AnsiString or a UnicodeString grows dynamically. The compiler and runtime code handle all this stuff for you.
See: http://delphi.about.com/od/beginners/l/aa071800a.htm

For a more in depth explanation see: http://www.codexterity.com/delphistrings.htm

The length can be anything from 0 chars (the empty string) up to about 2 GB.

Johan
1

Except for the old ShortString type, the newer string types in Delphi are dynamic. They grow and shrink as needed. You can preallocate a string to a given length by calling SetLength(), which is useful to avoid reallocating memory if you have to add data piece by piece to a string whose final length you already know. But even after that, the string can still grow and shrink when data are added or deleted. If you need static strings you can use array[0..n] of Char, whose size won't change dynamically.
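A minimal sketch of the preallocation described above, assuming an `AnsiString` and illustrative variable names: `SetLength` reserves the 3 characters once, yet nothing stops the string from growing afterwards.

```pascal
program PreallocateString;
{$APPTYPE CONSOLE}

var
  s: AnsiString;
  i: Integer;

begin
  SetLength(s, 3);                     // allocate room for 3 characters up front
  for i := 1 to 3 do
    s[i] := AnsiChar(Ord('A') + i - 1); // fill in place, no reallocation needed
  Writeln(s);

  s := s + 'D';                        // still dynamic: the string simply grows
  Writeln(Length(s));
end.
```

So `SetLength` is an optimisation, not a guard: if the length of 3 must be enforced, it has to be checked by the program or expressed through a short string or a static array.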