As far as I know, strings are 1-based in Delphi, and position 0 is reserved for the length. I am in charge of a huge application written in D5 and D2006 that calls the Copy function starting from index 0, and several colleagues are still coding this way. Because Copy is a Delphi 'magic' function, I believe that even when it is asked to copy from index 0, behind the scenes it copies from position 1.

For me, good practice is to copy a string from position 1, not position 0, even if the result is the same.

Now, my question is: can the application be affected when moving to another Delphi version because it calls Copy from position 0 instead of position 1?

RBA
  • 1
    Only old-fashioned ShortString has its length at 0 position. The structure of common-used strings (AnsiString etc) is more complex – MBo Mar 22 '12 at 14:37
  • The length of the string hasn't been in index 0 since Delphi 1, unless you explicitly declare a `ShortString`, as in `var MyShortStr: ShortString;` or `var MyShortString: String[255];`. You and your colleagues should buy a book or read the help file once in a while. :) – Ken White Mar 22 '12 at 15:36

1 Answer

The Delphi RTL ignores you when you pass 0 as the Index parameter to Copy for a string. When you pass 0 or less for Index, the RTL uses a value of 1. So what you are doing is benign: there is no observable difference in behaviour between passing 1 and passing any value less than 1. However, using 0 as a string index in Delphi is certainly confusing, and I would recommend not doing so.

In pseudo-code, the implementation of Copy starts like this:

```pascal
function Copy(s: string; Index, Count: Integer): string;
begin
  if Index < 1 then
    Index := 1;
  Dec(Index); // convert from 1-based to 0-based indexing
  // ....continues
```

In fact the actual implementation is a little more complex, but the above pseudo-code gives the correct semantics.
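
To see the clamping from the caller's side, here is a minimal console sketch. The program name and the sample string are made up for illustration, and it assumes a desktop Delphi compiler. If the semantics described above hold, all three Copy calls should yield the same substring.

```pascal
program CopyIndexDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

var
  S: string;
begin
  S := 'Delphi'; // arbitrary sample value

  // Index = 1 is the conventional way to start at the first character.
  Writeln(Copy(S, 1, 3));

  // Index values below 1 are expected to be treated as 1,
  // so these calls should print the same text as the line above.
  Writeln(Copy(S, 0, 3));
  Writeln(Copy(S, -5, 3));

  // Direct comparison of the 0-based and 1-based calls.
  Writeln(Copy(S, 0, 3) = Copy(S, 1, 3));
end.
```

Even though the results match, writing `Copy(S, 1, 3)` keeps the call consistent with direct indexing such as `S[1]`, where 0 is not a valid position for a long string.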

Your comment about the length being stored at index 0 is true for old-style short strings, but it is not true for long strings. Indeed, it was the short string's length byte at index 0 that led to the rather odd situation whereby strings are 1-based, but dynamic arrays, lists etc. are 0-based.
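
The difference between the two string kinds can be sketched as follows. The variable names are hypothetical, and the sketch assumes a desktop Delphi compiler where `ShortString` is available. It only reads the length byte of the short string; for the long string it relies on `Length` and starts indexing at 1.

```pascal
program ShortStringLengthDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

var
  Short: ShortString; // old-style string with a length byte at index 0
  Long: string;       // long string; its length is not stored at index 0
begin
  Short := 'abc';
  Long := 'abc';

  // For a ShortString, element 0 holds the length as a character,
  // so Ord(Short[0]) and Length(Short) report the same number.
  Writeln(Ord(Short[0]));
  Writeln(Length(Short));

  // For a long string, use Length; the first character is at index 1.
  Writeln(Length(Long));
  Writeln(Long[1]);
end.
```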

David Heffernan
  • 5
    Benign, as far as it doesn't crash. But dumb and definitely a code-smell. If I was code-reviewing these people, I'd point out how it could lead to all sorts of developer confusion including off-by-one errors. – Warren P Mar 22 '12 at 14:51
  • I wonder how many problems it would cause to change string semantics to be 0-indexed. I find it especially confusing if you have C++ Builder code, because it can use both (Delphi, 1-indexed) `String`s and (C++ std, 0-indexed) `wstring`s... although to be honest, accessing a character by index is a very rare thing to do. From memory, I *think* @Allen Bauer once wrote a blog post suggesting - with no guarantees, I remember the tone being 'just throwing it out there'-ish - that it might be changed one day. – David Mar 22 '12 at 16:07
  • I find the 1-indexed strings to be a major PITA. Even in code examples I've posted here I have made the mistake of indexing a string with '0 to length(stringThing)-1'. All the other iterators, stringLists, listViews, components, controls etc. etc. are all '0 to thingy.count-1', so why not string chars? – Martin James Mar 22 '12 at 16:32
  • 1
    @DavidM: in C++Builder, the `AnsiString` and `UnicodeString` classes actually do not accept index 0 in their `[]` operator, they will raise an index out of bounds exception. The `WideString` class allows index 0, but that does not access the first character, that accesses a portion of the string's length field instead. Seems some range checking is missing from `WideString`. – Remy Lebeau Mar 22 '12 at 20:09
  • 1
    @remy DavidM didn't say they did. He was talking about std::string and std::wstring. I don't know what you mean about a wide string's length field. Do you mean ShortString? – David Heffernan Mar 22 '12 at 22:48
  • @DavidHeffernan: I was simply commenting on the fact that you can't use index 0 with `AnsiString` and `UnicodeString` in C++Builder, that's all. No, I really meant `WideString`, which is Delphi/C++Builder's wrapper around a COM `BSTR` string. Like `Ansi/UnicodeString`, a `BSTR` also contains tracking data in front of the character payload, specifically a 4-byte length value. So if you access index 0 of a `WideString`, you are actually accessing the second 2 bytes of that length value, and index -1 accesses the first 2 bytes. – Remy Lebeau Mar 22 '12 at 23:03