
I want to confirm that this is indeed a bug and not a misunderstanding on my part about how std::quoted works.

Here is code that should, in my opinion, escape double quotes with double quotes and then unescape them back to the original string:

#include <iostream>
#include <iomanip>
#include <string>
#include <sstream>

int main()
{
    std::string s = R"(something"something)";
    std::cout << "Original: \t" << s << std::endl;

    std::ostringstream oss;
    oss << std::quoted(s, '"', '"');
    std::string s1 = oss.str();
    std::cout << "Quoted: \t" << s1 << std::endl;

    std::istringstream iss(s1);
    std::string s2;
    iss >> std::quoted(s2, '"', '"');
    std::cout << "Unquoted: \t" << s2 << std::endl;

    return 0;
}

The expected output:

Original:   something"something
Quoted:     "something""something"
Unquoted:   something"something

However, this is what I get in VS2017 15.6.6:

Original:   something"something
Quoted:     "something""something"
Unquoted:   something

Could anyone confirm this is a bug?

UPDATE:

Good news everyone. The ticket I filed with MS got marked as fixed.

Killzone Kid
  • Both libstdc++ and libc++ exhibit the behavior you expect, but I'm not sure from my reading of [the standard](http://eel.is/c++draft/quoted.manip#3) what the expected behavior when `delim == escape` is. I don't see anything saying to prefer considering a char as a delimiter or as an escape. VS2017 seems to prefer a delimiter, whereas libstdc++ and libc++ don't. – Justin Apr 26 '18 at 17:15
  • @Justin Thank you for the reference – Killzone Kid Apr 26 '18 at 17:34
  • I'd say that it's a defect in the standard, the case `delim==escape` is quite common, it should be at very least specified in some way. – Matteo Italia Apr 26 '18 at 17:35

1 Answer


After further research into this problem, I would like to answer my own question. Thanks to @Justin and @MatteoItalia for the insightful comments. In short, this is not a bug but rather behaviour the standard leaves unspecified: VC++ seems to follow the standard to the letter, while other implementations take the liberty of interpreting it in their own way, because there is no strict guideline on how to deal with the case of delim == escape, and this paragraph doesn't explain it:

Until an unescaped delim character is reached or !in, extract characters from in and append them to s, except that if an escape is reached, ignore it and append the next character to s.

This is exactly what is happening with VC++: the first double quote " after something is interpreted as an unescaped delim character rather than an escape character, which terminates the routine. IMO the wording needs clarification, something like: if an escape character is found, check whether it is followed by a delim character and, if so, discard the escape character. Here is the original proposal, I think.

I have filed a report on the MS bug tracker, though TBH I don't have much hope for a fix. I then found this ticket, where @Barry suggests using the existing implementation from GCC (here and here), and this is exactly what I did.

This works as expected for me, but if you have any concerns or suggestions, please do share.

T.C.
Killzone Kid
  • Ensure you fix the names used by libstdc++. `_Quoted_string` is a reserved identifier (similarly, names like `__something` are reserved). Also, be sure you adhere to the license of libstdc++ (it [seems to be LGPL](https://github.com/gcc-mirror/gcc/blob/master/COPYING.LIB)) – Justin Apr 27 '18 at 01:33
  • If you want a more permissive license, use libc++; it's dual MIT + UIUC: https://libcxx.llvm.org/ – Justin Apr 27 '18 at 01:36
  • A plain reading of that paragraph actually makes it clear that VC++ is doing the wrong thing when it encounters the `""`; the ambiguity comes when you reach the *end* of the stream. "Until A do B, except if C do D" is defining a general rule w/an exceptional case that can be tested for and must override A/B. The defects are that: 1. The behavior when the escape is the last character in a stream is undefined (there is no "next character") and 2. as you noted, by making `escape` escape *anything*, not just `delim`, using the same character for both means an "open quote" quotes until (past?) EOF. – ShadowRanger May 02 '18 at 15:08