The quotes are a presentation thing. Once you parse/tokenize the data, you want the unescaped data back.
The quoted/escaped representation is to protect special characters in your data in transit only (to prevent them from interfering with your protocol¹).
Once you read it back, it is no longer in transit, and to "keep" the escapes or quotes (or whatever other artefacts come with your protocol¹) would be an error. In fact, it is a frequent source of bugs, and not seldom of security vulnerabilities².
Samples
- CSV: `a` or `"a"` corresponds to a value of `a`
- likewise:
  - `"\""` corresponds to `"`
  - `"\\\""` corresponds to `\"`
  - `"\"` is incomplete (the quoted construct is not closed)
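
The samples above can be checked with a small parser. This is only a minimal sketch: the function name `unquote` and the grammar (a bare field, or a double-quoted field with backslash as the escape character) are assumptions made for illustration, not any particular library's API.

```python
def unquote(text: str) -> str:
    """Return the value that `text` represents.

    Assumed grammar (hypothetical): a field is either bare, or wrapped in
    double quotes with backslash as the escape character.
    """
    if not text.startswith('"'):
        return text  # bare field: the text is already the value

    out = []
    i = 1  # skip the opening quote
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            if i + 1 >= len(text):
                raise ValueError("dangling escape character")
            out.append(text[i + 1])  # keep the escaped character, drop the backslash
            i += 2
        elif ch == '"':
            if i != len(text) - 1:
                raise ValueError("trailing characters after closing quote")
            return "".join(out)  # closing quote reached: value is complete
        else:
            out.append(ch)
            i += 1
    raise ValueError("quoted construct is not closed")


assert unquote('a') == 'a'
assert unquote('"a"') == 'a'
assert unquote(r'"\""') == '"'
assert unquote(r'"\\\""') == r'\"'
```

With this sketch, `unquote(r'"\"')` raises `ValueError`, because the only remaining quote is consumed as an escaped character and the construct is never closed.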
The important thing is that your values roundtrip without loss of information. So, parsing `"a"` as the value `"a"` creates the conceptual error that converting it back to quoted-escaped format would suddenly look like `"\"a\""`, which is an entirely different thing!
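
The roundtrip argument can be sketched in a few lines. The helpers `quote` and `unquote` are hypothetical names, and the format (double quotes with backslash escapes) is an assumption for illustration. `unquote` here handles well-formed input only.

```python
def quote(value: str) -> str:
    """Serialize a value: escape backslashes and quotes, then wrap in quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def unquote(text: str) -> str:
    """Inverse of quote() for well-formed quoted input (no error handling)."""
    body = text[1:-1]  # strip the surrounding quotes
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\":
            i += 1  # skip the escape character, keep what follows
        out.append(body[i])
        i += 1
    return "".join(out)


wire = '"a"'          # what arrived in transit
good = unquote(wire)  # the value a
bad = wire            # bug: quotes kept as if they were part of the value

assert quote(good) == wire        # roundtrips without loss
assert quote(bad) == r'"\"a\""'   # an entirely different thing on the wire
```

The second assertion shows the failure mode: once the quotes have leaked into the value, serializing it again escapes them, and the data no longer matches what was received.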
¹ presentation format or transport protocol
² most commonly, code injection:
Code injection vulnerabilities (injection flaws) occur when an application sends untrusted data to an interpreter. Injection flaws are most often found in SQL, LDAP, XPath, or NoSQL queries; OS commands; XML parsers, SMTP headers, program arguments, etc. Injection flaws tend to be easier to discover when examining source code than via testing.[1] Scanners and fuzzers can help find injection flaws.[2]