0

I am having an issue std::string assignment with the below code in VS2019, where it is working fine in VS2015. I am using Visual Studio 2019 (v142) as Platform Toolset in VS2019 and Visual Studio 2015 (v140) as Platform Toolset in VS2015.

class Test
{
public:
    Test()
    {
        m_DisplayName = "";
        m_Type = "";
    }
    std::string m_DisplayName;
    std::string m_Type;
};

void CMFCApplication2Dlg::TestFunction()
{
    // Sample Data setting in vector
    std::vector<Test> vTest;
    Test obj;
    obj.m_DisplayName = "Nam1gd";
    obj.m_Type = "t1";
    vTest.emplace_back(obj);

    Test obj2;
    obj2.m_DisplayName = "Nam2";
    obj2.m_Type = "t2";
    vTest.emplace_back(obj2);

    VARIANT vtResult_o;
    VariantInit(&vtResult_o);
    vtResult_o.vt = VT_ARRAY | VT_UI1;
    int usrCount = vTest.size();
    ULONG nBufferSize = usrCount * sizeof(Test);

    // Define a safe array of usrCount Item and Starting index as 0 
    SAFEARRAYBOUND safeBounds = { nBufferSize, 0 };

    //Create the Safe Array passing it the bounds 
    SAFEARRAY* pSafeArray = SafeArrayCreate(VT_UI1, 1, &safeBounds);
    Test* vTestArray = nullptr;
    SafeArrayAccessData(pSafeArray, (void**)&vTestArray);

    // Data setting to Safe array
    for (int nIdx = 0; nIdx < usrCount; nIdx++)
    {
        // ******VALUES ASSIGNING CORRECTLY in VS 2015 ******/
        // ******JUNK VALUE COMES IN THIS ASSIGNMENT in VS 2019 ******/
        vTestArray[nIdx].m_DisplayName = vTest[nIdx].m_DisplayName;
        vTestArray[nIdx].m_Type = vTest[nIdx].m_Type;
    }

    /* decrement the lock count*/
    SafeArrayUnaccessData(pSafeArray);
    vtResult_o.parray = pSafeArray;
}

Since the string value in the vTest[nIdx].m_DisplayName is small it is stored in the small buffer in s._Bx._Buf. In VS2015 while assigning, the small buffer in the source string is copied to the destination string's small buffer.

 for (int nIdx = 0; nIdx < usrCount; nIdx++)
 {
     // ******VALUES ASSIGNING CORRECTLY in VS 2015 ******/
     // ******JUNK VALUE COMES IN THIS ASSIGNMENT in VS 2019 ******/
     vTestArray[nIdx].m_DisplayName = vTest[nIdx].m_DisplayName;
     vTestArray[nIdx].m_Type = vTest[nIdx].m_Type;
 }

Source string However, in VS2019, it is observed that the during assignment the destination's buffer ( s._Bx._Buf) is updated with junk value and the actual value is updated to the heap s._Bx._ptr.

Ideally since s._Bx is a union either the s._Bx._Buf or s._Bx._ptr should be present. Please find the image below. Destination

[Workaround]

However if I do a std::string casting in VS2019 it is assigning the std::string's small buffer in the source to the small buffer in the destination.

for (int nIdx = 0; nIdx < usrCount; nIdx++)
{
    vTestArray[nIdx].m_DisplayName =(std::string)(vTest[nIdx].m_DisplayName);
    vTestArray[nIdx].m_Type = (std::string)(vTest[nIdx].m_Type);
}

I will appreciate if some one can help me to understand why this difference in VS2015 and VS2019!

aks
  • 302
  • 2
  • 10
  • 1
    So your problem is that the assignment doesn't use the small string optimization? Everything works though? – Goswin von Brederlow Jun 16 '22 at 21:01
  • 1
    I think the problem is that your code doesn't construct the `Test` objects in your 'safe array'. Perhaps you need to use placement new. – john Jun 16 '22 at 21:04
  • What does `SafeArrayAccessData` do? It's 99.9% UB. – Goswin von Brederlow Jun 16 '22 at 21:04
  • Same question about `SafeArrayCreate`. And pretty muich all the rest of the code not shown. Maybe start learning modern C++ like `std::span` insteadof SAFEARRAYBOUND. And `new` instead of whatever magic you use to allocate arrays. – Goswin von Brederlow Jun 16 '22 at 21:07
  • 2
    Safe array seems to be ironically named. – john Jun 16 '22 at 21:08
  • Do you realize that `vTest.emplace_back(obj);` is pointless? It can't construct the obejct in-place, it can't even move the data into place. It has to copy `obj`. You should write a proper constructor for `Test` with member initialization list and then `vTest.emplace_back("Nam1gd", "t1");` – Goswin von Brederlow Jun 16 '22 at 21:13
  • 1
    _"updated with junk value"_ - The debugger is showing you the value of the inactive `union` member. It's not junk. – Ted Lyngmo Jun 16 '22 at 21:22
  • `Test` contains `std::string` objects, which themselves contain pointers to outside memory. What is the point of storing `Test` objects in a byte array? If you are trying to *serialize* the `Test` objects (for transmission, storage, etc), this is not the correct way to go about doing it. – Remy Lebeau Jun 16 '22 at 21:24
  • I understood there are better ways do it. This a representation of big code base, recently ported from VS2015 to VS2019. My specific doubt is why the same code worked in 2015 and not working in 2019. – aks Jun 17 '22 at 02:26

1 Answers1

3

You are treating the array you obtained as if it contained alive Test objects, which it doesn't. That causes undefined behavior.

Instead of assigning directly, which pretends that there is already an alive object, you should use placement-new to create and start the lifetime of the Test object explicitly first:

auto obj = new(vTestArray + nIdx) Test;
obj->m_DisplayName = vTest[nIdx].m_DisplayName;
obj->m_Type = vTest[nIdx].m_Type;

and don't forget to destroy the objects when you are done with them with explicit destructor calls:

for (int nIdx = 0; nIdx < usrCount; nIdx++) {
    vTestArray[nIdx].~Test();
}

(Technically there might be some minor issue here requiring some additional std::launder call on vTestArray if we read the standard strictly and SafeArrayAccessData would not be considered to produce a "pointer to a suitable created object" under the C++20 implicit object creation rules.)


There is also a more minor issue in that the cast in SafeArrayAccessData(pSafeArray, (void**)&vTestArray); is wrong. You cannot reinterpret a pointer to Test* as a pointer to void*. That is an aliasing violation. You should use an intermediate pointer of the correct type:

void* vTestArrayVoid = nullptr;
SafeArrayAccessData(pSafeArray, &vTestArray);
auto vTestArray = reinterpret_cast<Test*>(vTestArrayVoid);

(Please note that there should not be a std::launder call here against my advice in a previous edit, since there is no Test object in its lifetime yet.)


I am also not sure what the intended use of the array is, but you should be aware that the strings will generally allocate additional memory that is not part of the array. So this is not a reliable way to e.g. share memory between processes.

But if you don't have some specific use case in mind (and I might just not be aware here of the obvious if it is Microsoft-specific) then there would be no reason to make it so complicated when std::vector, potentially reference-counted via std::shared_ptr, already does all of this automatically.

Doing it manually in the way I suggested is also not exception-safe, which a std::vector would be. If an exception is thrown during construction of one of the Test objects, then you will forget the destruction of the already-constructed Test objects with this naive implementation.

user17732522
  • 53,019
  • 2
  • 56
  • 105
  • I understood there are better ways do it. This a representation of big code base, recently ported from VS2015 to VS2019. My specific doubt is why the same code worked in 2015 and not working in 2019. – aks Jun 17 '22 at 02:26
  • There's been numerous improvements to the conformance of the Visual C/C++ Library since VS 2015. The newer versions are 'binary compatible' all the way back to VS 2015 Update 3, but there's no guarantee that "undefined behavior" is the same. – Chuck Walbourn Jun 17 '22 at 02:47
  • @ChuckWalbourn Was your comment meant to address aks? Otherwise I am not sure whether you are just adding information or if there was some implied issue with the answer. – user17732522 Jun 17 '22 at 03:57
  • @aks I don't know about the MSVC and Microsoft STL internals. But even if you found an explanation, e.g. either that the optimizer was improved to perform some specific optimization in this case changing the behavior or that the `std::string` implementation was modified to make the small-string allocation decision differently, what is that going to help you? The code is broken either way and needs to be fixed properly. With workarounds like you have shown you are at best delaying the issue until the next update. – user17732522 Jun 17 '22 at 04:00
  • I will make the question more simple. C++14 platform toolset. struct structTest { int a = 0; std::string m_str; }; void TestFunc() { structTest st1; memset(&st1, 0, sizeof(st1)); st1.a = 10; st1.m_str = "ABCD"; } In VS2015, the st1.m_str is set to "ABCD". But in VS2019 it is junk. If I set value of length greater than 15( s._Bx._ptr) then no issues. I am seeing issues for the short strings only(s._Bx._Buf). I know we can use proper initialization instead of memset. However I am trying to understand is the actual reason for this difference. – aks Jun 21 '22 at 13:51
  • @aks The `memset` has undefined behavior because `std::string` is not a trivially-copyable type. Even if we ignore that, you are zeroing all members, including the field which indicates the reserved memory size for the string. The class uses that field to determine whether or not SSO is used and if the new string is larger than it, it will _always_ allocate memory dynamically, see https://github.com/microsoft/STL/blob/ee43e7097e9c48326c523af2ce3b0da8671339b7/stl/inc/xstring#L3423. I can't tell you what was different in VS2015, because that is from before Microsoft published the STL code. – user17732522 Jun 21 '22 at 14:17