-5

Out of curiosity, I am trying to determine a string's length without using properties or methods. This works in C, but I am still fairly new to the concept of \0.

From what I understand of C, this character is automatically placed at the end of a string when a value is assigned to it, so that it can be determined how much space the string occupies. But how does this work in C#?

    string str = "someText";
    int length = 0;
    // Scan for a '\0' terminator, as one would in C
    while (str[length] != '\0')
    {
        length++;
    }
    Console.WriteLine(str);
    Console.WriteLine("String length is : " + length);
    Console.ReadLine();
Djordje
  • C# strings are not null terminated. –  Sep 08 '17 at 19:17
  • Have you tried debugging and tracing through it before asking how it works? It should be pretty obvious what happens... – Broots Waymb Sep 08 '17 at 19:18
  • Strings have two fields: `int m_stringLength` and `char _firstChar`, they aren't null terminated, as the length is known. **EDIT:** Actually, they are null terminated, this is a comment from the coreclr repo: " // For empty strings, this will be '\0' since // strings are both null-terminated and length prefixed" – José Pedro Sep 08 '17 at 19:18
  • @LeonardoAlvesMachado It does; the method on the `String` class which gives you the length is the `Length` property. – Tanner Swett Sep 08 '17 at 19:19
  • In C#, the last character of a string is not the null character as in C. You might want to catch IndexOutOfRangeException to determine the end of the string. – Chetan Sep 08 '17 at 19:20
  • @DangerZone Of course it is obvious what happens, but that is not the point. As I said, I am trying to compare C with C# in order to understand the difference. – Djordje Sep 08 '17 at 19:23
  • @Amy In C# strings *are* null terminated, but that fact is obscured from you. When trying to get the characters of the string you can't get the terminating null character, but if you were to, say, pass the string to unmanaged code that used null terminated strings, you'd actually be able to see the terminating null character. Of course, because C# obscures the null character from you, the OP's code won't work. – Servy Sep 08 '17 at 20:43

4 Answers

3

Consider a string as an abstract sequence of characters. There are multiple ways to implement such a sequence in a library. Here are some examples:

  • Null-terminated - A sequence of characters is terminated with a special character value '\0'. C uses this representation.
  • Character-terminated - This is similar to null-terminated, but a different special character is used. This representation is rare these days.
  • Length-prefixed - First few bytes of the string are used to store its length. Pascal used this representation.
  • Length-and-Character Array - This representation stores a string as a structure with an array of characters, and stores the length in a separate area.

Mixed representations are also possible, such as storing a null-terminated string inside a length-and-array representation.
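To make the difference between the first and third representations concrete, here is a minimal sketch that models them with plain `char[]` buffers. The class and method names (`StringRepresentations`, `NullTerminatedLength`, `LengthPrefixedLength`) are made up for illustration, and storing the length in the first array element is only one assumed way to lay out a length prefix; this is not how any real runtime is implemented.

```csharp
using System;

public static class StringRepresentations
{
    // Null-terminated: scan until the '\0' sentinel is found.
    // Assumes the buffer really contains a '\0'; otherwise the indexer throws.
    public static int NullTerminatedLength(char[] buffer)
    {
        int length = 0;
        while (buffer[length] != '\0')
        {
            length++;
        }
        return length;
    }

    // Length-prefixed (hypothetical layout): element 0 holds the count,
    // so the length is read directly without scanning.
    public static int LengthPrefixedLength(char[] buffer)
    {
        return buffer[0];
    }

    public static void Main()
    {
        char[] nullTerminated = { 's', 'o', 'm', 'e', '\0' };
        char[] lengthPrefixed = { (char)4, 's', 'o', 'm', 'e' };

        Console.WriteLine(NullTerminatedLength(nullTerminated)); // 4
        Console.WriteLine(LengthPrefixedLength(lengthPrefixed)); // 4
    }
}
```

The scan costs time proportional to the length and cannot represent an embedded '\0', while the prefix gives the length in constant time and places no restriction on the characters stored.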

The main .NET implementations of C# strings use a combination of length-prefixed and null-terminated storage, and also allow null characters to be embedded in the string. However, this does not mean that you can access the null terminator or the length bytes: unlike C, C# performs bounds checks and throws an exception on any attempt to access an index past the end of the string.
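The bounds check and the embedded-null behaviour can be observed with a small sketch like the one below. The class and method names (`BoundsCheckDemo`, `CountUntilOutOfRange`) are hypothetical. Counting by catching `IndexOutOfRangeException` is shown only to illustrate where the indexer stops; it is not a recommended way to get a length.

```csharp
using System;

public static class BoundsCheckDemo
{
    // Walks the string by index until the indexer refuses to go further.
    public static int CountUntilOutOfRange(string text)
    {
        int length = 0;
        try
        {
            while (true)
            {
                _ = text[length]; // throws once length == text.Length
                length++;
            }
        }
        catch (IndexOutOfRangeException)
        {
            // Reached the end: any hidden terminator is not reachable here.
        }
        return length;
    }

    public static void Main()
    {
        // The '\0' in the middle is an ordinary character of the string.
        string withNull = "some\0Text";

        Console.WriteLine(withNull.Length);                // 9
        Console.WriteLine(CountUntilOutOfRange(withNull)); // 9
    }
}
```

A loop that stops at the first '\0', as in the question, would report 4 for this string and would throw for a string that contains no '\0' at all.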

Sergey Kalinichenko
  • "C# uses a combination of length-prefixed and null-terminated strings" Have a source for this statement? The documentation for [String](https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/) says otherwise. –  Sep 08 '17 at 19:42
  • @Amy This comes from reading reference source [here](https://referencesource.microsoft.com/#mscorlib/system/string.cs) and [here](https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/String.cs). Comments and assertions say that they expect null termination to be there. This is most likely done for compatibility with C libraries and for debugging. – Sergey Kalinichenko Sep 08 '17 at 19:50
  • Nice find. "For empty strings, this will be '\0' since strings are both null-terminated and length prefixed" –  Sep 08 '17 at 19:53
  • @Amy The documentation doesn't mention that the string is null terminated because it's an implementation detail, not a part of the language (your own runtime might choose not to represent strings that way), and also because the fact that it's null terminated is obscured from you. – Servy Sep 08 '17 at 20:45
  • @Servy But if it's an implementation detail, would it then be incorrect to say C# uses both null-terminated and length-prefixed strings? The question is about the language, not a particular implementation. –  Sep 08 '17 at 20:46
  • @Amy Indeed. You could say that .NET uses null terminated and length prefixed strings. When speaking about C# all you can say is that the representation is an implementation detail. – Servy Sep 08 '17 at 20:49
  • @Servy Hm, I feel that the statement I scrutinized can be worded better to avoid being potentially misleading. Two major implementations of C# use null-terminated strings, but the language itself doesn't. But I defer to your judgment. –  Sep 08 '17 at 20:57
  • @Amy I agree that it's a point that would benefit from being further clarified, because it can be confusing. – Servy Sep 08 '17 at 20:58
  • @dasblinkenlight I am a little confused because I don't know what to believe. You provided a very nice explanation, from null-terminated to length-and-character array, and that was part of my question too, because I said "still kind of new to this concept of \0." **Thank you all for the nice documentation.** – Djordje Sep 08 '17 at 21:15
1

A string is an object of type String whose value is text. Internally, the text is stored as a sequential read-only collection of Char objects. There is no null-terminating character at the end of a C# string; therefore a C# string can contain any number of embedded null characters ('\0'). The Length property of a string represents the number of Char objects it contains, not the number of Unicode characters. To access the individual Unicode code points in a string, use the StringInfo object.

https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/

C# strings do not have a terminating character. Instead, their actual length is stored with the text itself. This is called a length-prefixed string.

You can read the length using the Length property.
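As a small illustration of the quoted point that `Length` counts `Char` objects rather than what a reader sees as characters, the sketch below compares `Length` with `StringInfo.LengthInTextElements`. The class and method names (`LengthDemo`, `TextElementCount`) are made up for this example. Note that `StringInfo` works in text elements, so treat this as a rough approximation of "characters as displayed" rather than an exact code point count.

```csharp
using System;
using System.Globalization;

public static class LengthDemo
{
    // Number of text elements (a base character plus its combining marks
    // counts as one element).
    public static int TextElementCount(string text)
    {
        return new StringInfo(text).LengthInTextElements;
    }

    public static void Main()
    {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT: two Char values,
        // displayed as a single accented letter.
        string accented = "e\u0301";

        Console.WriteLine(accented.Length);            // 2
        Console.WriteLine(TextElementCount(accented)); // 1
    }
}
```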

0

In C, terminating strings (handled as character arrays) with the nul character is a convention that is supported by the standard library. It is not part of the language itself. Note that nothing "automatically" adds a nul character when you set a C string value: the library functions take care of it, and the compiler adds it for string constants.

C# does not use nul-terminated strings, so none of this applies.

NetMage
-1

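Here is C# code that counts the characters without using the Length property:

    string str = "someText";
    int length = 0;

    // foreach stops at the end of the string, so no terminator check is needed
    foreach (char c in str)
    {
        length++;
    }
    Console.WriteLine(str);
    Console.WriteLine("String length is : " + length);
    Console.ReadLine();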
vik