-2

Why do strings in almost all languages require that you escape quotation marks?

For instance, if you have a string such as

"hello world""

why do languages want you to write it as

"hello world\""

Isn't it enough to require that the string starts and ends with a quotation mark?

You could treat the last quote on the line as the terminating quote for the string. If there is no closing quote, that is an error. You could also assume that a string starts and ends on a single line and does not span multiple lines.
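A minimal sketch of that rule, assuming exactly one string literal per line. The function name `read_string_last_quote` is made up for illustration and does not come from any real language implementation.

```python
def read_string_last_quote(line: str) -> str:
    """Return the text between the first and the last double quote on a line."""
    start = line.find('"')
    end = line.rfind('"')
    if start == -1 or end == start:
        # Zero or one quote on the line: there is no terminating quote.
        raise ValueError("unterminated string")
    return line[start + 1:end]


print(read_string_last_quote('"hello world""'))  # hello world"
```

Under this assumption the unescaped example reads as intended. The rule only holds while a line contains a single string.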

Har
  • As a human being, how do you know which quotation mark is the closing one? Right... let alone a machine. – revo May 26 '17 at 16:34
  • You are right, a string must start and finish with quotation marks (or ', depending on the language). You do realise that writing "hello world\"" will cause *hello world"* to be printed? – LJH May 26 '17 at 16:38
  • Quotes are delimiters for something. It doesn't have to be a language; it could be a CSV file. The bottom line is that _delimiters_ are used to parse every aspect of source languages. Why do you ask this question? Are you writing a new language? –  May 26 '17 at 16:39

5 Answers

1

How would the compiler know which quote ended the string?

UPDATE:

In C & C++, this is a perfectly fine string:

printf("Hel"   "lo" ", " "Wor""ld"  "!");

It prints Hello, World!

Or how about in C#:

Console.WriteLine("Hello, "+"World!");

Now, should that print Hello, World! or Hello, "+"World! ?

James Curran
  • Last quote ends the string, if last quote is not present then there is an error – Har May 26 '17 at 16:35
  • Then how should the interpreter identify the *last quote* in the code snippet below: `hello = "Hello, "; world = " world!";`? @Har – revo May 26 '17 at 16:42
  • Yes, I am making the assumption that the language is line-based, that a string cannot span multiple lines, and that there is only one statement per line. However, even in the example above, following those rules, there would be one string, and that would work... – Har May 26 '17 at 19:08
  • Good example. Just like in @Daniel H's answer, things turn ambiguous as soon as you have multiple strings on the same line. – Har May 26 '17 at 19:16
1

Otherwise, the compiler would see the second quotation mark as the end of your string, followed by a stray quotation mark, which causes an error.

"The use of the word "escape" really means to temporarily escape out of parsing the text and into another mode where the subsequent character is treated differently." Source: https://softwareengineering.stackexchange.com/questions/112731/what-does-backslash-escape-character-really-escape
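That "other mode" can be sketched as a tiny scanner. This is only an illustration under simplifying assumptions: the name `read_string_with_escapes` is hypothetical, and the escaped character is copied as-is, whereas real languages also translate sequences such as \n.

```python
def read_string_with_escapes(src: str) -> str:
    """Scan one double-quoted literal at the start of src, honoring backslashes."""
    if not src.startswith('"'):
        raise ValueError("expected an opening quote")
    chars = []
    i = 1
    while i < len(src):
        ch = src[i]
        if ch == "\\":
            # Escape mode: take the next character literally, whatever it is.
            if i + 1 >= len(src):
                raise ValueError("dangling backslash")
            chars.append(src[i + 1])
            i += 2
        elif ch == '"':
            # An unescaped quote always terminates the literal.
            return "".join(chars)
        else:
            chars.append(ch)
            i += 1
    raise ValueError("unterminated string")


print(read_string_with_escapes('"hello world\\""'))  # hello world"
```

Without the backslash, the same scanner stops at the first closing quote and returns only hello world, leaving the extra quote behind as stray input.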

LJH
1

Suppose I want to put ", " into a string literal (so the literal contains quotes).

If I did that without escaping, I’d write "", "". This looks like two empty string literals separated by a comma. If I want to, for example, call a function with this string literal, I would write f("", ""). This looks to the compiler like I am passing two arguments, both empty strings. How can it know the difference?

The answer is, it can’t. Perhaps in simple cases like "hello world"", it might be able to figure it out, for at least some languages. But the set of strings which were unambiguous and didn’t need escaping would be different for different languages and it would be hard to keep track of which was which, and for any language there would be some ambiguous case which would need escaping anyway. It is much easier for the compiler writer to skip all those edge cases and just always require you to escape quotation marks, and it is probably also easier for the programmer.
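The two readings of f("", "") can be made concrete with a rough sketch. The regular expressions below stand in for two possible tokenizing rules; they are an assumption for illustration, not how any particular compiler works.

```python
import re

call = 'f("", "")'

# Rule A (what most languages do): a literal ends at the next quote.
shortest = re.findall(r'"(.*?)"', call)

# Rule B (hypothetical): a literal runs to the last quote on the line.
longest = re.findall(r'"(.*)"', call)

print(shortest)  # ['', '']
print(longest)   # ['", "']
```

Rule A sees two empty strings and rule B sees one string containing quotes. Nothing in the source text itself says which one was meant.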

Daniel H
  • Very good point, so if you had a language that supported multiple strings on the same line, this would cause an ambiguity. Thanks for the insight :) – Har May 26 '17 at 19:14
0

The reason you have to escape the second quotation mark is so the compiler knows that the quotation mark is part of the string, and not a terminator. If you weren't escaping it, the compiler would only pick up hello world rather than hello world"

Austin Parker
0

Let's do a practical example.

How should this be translated?

"Hello"+"World"
    'HelloWorld' or 'Hello"+"World'
vs
"Hello\"+\"World"

By escaping the quote characters, you remove the ambiguity, and code should have zero ambiguity to the compiler. Every compiler should give the same code the same meaning. It's basically a way of telling the compiler, "I know this looks weird, but I really mean that this is how it should look."
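As one concrete data point, Python's own tokenizer (the standard `tokenize` module) can be asked how it splits both spellings. The helper `string_tokens` is a name made up for this sketch, and Python is used only as an example of a language with backslash escapes.

```python
import io
import tokenize


def string_tokens(source: str) -> list:
    """Return the string-literal tokens Python's tokenizer finds in source."""
    reader = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(reader)
            if tok.type == tokenize.STRING]


print(string_tokens('"Hello"+"World"\n'))      # ['"Hello"', '"World"']
print(string_tokens('"Hello\\"+\\"World"\n'))  # ['"Hello\\"+\\"World"']
```

The unescaped form is always two literals joined by +, and the escaped form is always a single literal, so each spelling has exactly one reading.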

Tezra
  • Compilable code always has zero ambiguity to the compiler, as I'm sure the compiler would always know which of those two translations to choose. The problem is ambiguity to the human reader. – James Curran May 26 '17 at 20:22
  • @JamesCurran for a **Specific** compiler, yes. But any ambiguity in how it should be interpreted means different compilers may come to different conclusions about what you meant. – Tezra May 26 '17 at 20:27