
When calling a function that expects a BSTR it'd be nice to be able to write something like:

iFoo->function( bs"HELLO" );

However the only workaround I'm aware of is to use a wrapper that calls SysAllocString etc., e.g.:

iFoo->function( WideString(L"HELLO").c_bstr() );

which is kind of ugly. Is there actually a way to create a BSTR literal?

Motivation: easier-to-read code, and faster runtime performance by avoiding an allocation and deallocation.

Clarification: I am only talking about situations where the caller (i.e. us) has ownership of the BSTR, for example: calling a function that takes a BSTR [in] parameter. Of course, it would be silly to supply a pointer to a BSTR literal to a function which will go on to try and free the string.

M.M
  • A very ugly solution is `L"\xA\0" "HELLO" + 2` – M.M Jan 13 '15 at 09:33
  • Not only ugly, wait until someone calls `SysFreeString()` on that :) – Frédéric Hamidi Jan 13 '15 at 09:35
  • BSTRs are by definition dynamically managed. The concept could certainly be tossed into a literal, but it wouldn't be a BSTR. And it would be disastrous if ever used in a place where it was eventually freed. The OLE libs have *many* places (variant functions, marshaller code, etc.) where things like VARIANT members are managed behind the scenes. Placement of something like this in such a place would be disastrous. You could always punt and just use any of the canned BSTR smart pointer classes like `bstr_t` or `CComBSTR`. – WhozCraig Jan 13 '15 at 09:40
  • @FrédéricHamidi Functions that accept BSTR shouldn't be freeing it (i.e. memory is managed by the caller), this is the same as complaining that you can't do `printf("hello");` because `printf` might free the string! – M.M Jan 13 '15 at 10:08
  • @WhozCraig If the called function does free the string (and we did not use the literal I am suggesting) it'd break the program anyway as there would be a double free. – M.M Jan 13 '15 at 10:08
  • @MattMcNabb Why would you free a BSTR that is being freed already? `VariantChangeType` for example, called from a loaded `VARIANT` with a "literal" invalid BSTR for "1000" changing to `VT_LONG` would be UB from the rafters. BSTR `[in,out]` marshalling by definition will free the input `BSTR` and replace it with an output BSTR *multiple times*. There is **zero** sense in literal BSTRs. I just reread your comment and I *think* you agree. I concur, something being sent a `[in] BSTR` should not be freeing it (except `SysFreeString` of course). – WhozCraig Jan 13 '15 at 10:14
  • @WhozCraig When creating a VARIANT that owns its BSTR you would allocate one. BSTRs have well defined ownership semantics. Only the owner should free the string. There is no `[in, out] BSTR`, only `BSTR *` which has callee-ownership. – M.M Jan 13 '15 at 10:17
  • @MattMcNabb Declaring `[in,out] BSTR` in MIDL will create `BSTR *` in the generated header and proxy/stub (it did last I checked anyway; been awhile). I completely agree its all about ownership. – WhozCraig Jan 13 '15 at 10:20
  • @WhozCraig OK, so when it is a caller-ownership situation then why shouldn't the caller be able to use a literal in order to avoid wasting time with an allocation and deallocation? – M.M Jan 13 '15 at 10:20
  • Answering my own question... I guess that some memory checker tool might try and check that any pointer supplied to a function expecting BSTRs actually corresponds with something that exists in the OLE allocation table – M.M Jan 13 '15 at 10:26
  • @MattMcNabb *that* is a good question. Since everything `BSTR` is supposed to play by the rules, how it is *built* is not up to you; it's up to MS. They [document them](http://msdn.microsoft.com/en-us/library/windows/desktop/ms221069(v=vs.85).aspx), which is nice, but they're also free to *change* that. Trying to literalize that is easily as tedious as, if not more tedious than, just playing by the rules. Ex: your proposed "very ugly" solution is *not* conforming (it doesn't have two terminating nulls). Why would *you* want to do that? And you can toss BSTR-caching, which COM does for you, out entirely. – WhozCraig Jan 13 '15 at 10:27
  • Great brain-food, btw. uptick =P – WhozCraig Jan 13 '15 at 10:31
  • @WhozCraig I always took "two null characters" to mean two null narrow characters (i.e. one null wide character) - because requiring two null wide characters is just too strange! :) – M.M Jan 13 '15 at 10:44
  • It caught me somewhat by surprise as well, since the rest of the documentation freely interchanges "characters" with wide or narrow depending on the context. I wish I had a Windows box to verify, but as memory serves there are 32-bits of nothingness at the end of a valid BSTR. If you have one handy (a windows box) I'm truly curious which it is. Otherwise I'll check tomorrow at work and report back. – WhozCraig Jan 13 '15 at 10:47
  • @WhozCraig Tried it just now, [here is the result](http://i.imgur.com/lDpXxh9.png) ... – M.M Jan 13 '15 at 11:01
  • Definitely worth noting in your question imho. The ABABAB looks like typical MS debug fill. I humbly apologize for my inaccuracy. (and makes me wonder where *did* I see all those added octets). Now I'm genuinely curious if MS just ignores state-junk they may be keeping when given a non-rule-allocated `BSTR` in their functions. Good question! – WhozCraig Jan 13 '15 at 11:03
  • @WhozCraig AB stands for allocated block i.e. an uninitialized memory block that was allocated with LocalAlloc(). – AndersK Jan 13 '15 at 11:33
  • Similar question: http://stackoverflow.com/questions/20264616/c-create-bstr-at-compile-time-insert-length-into-string-at-compile-time – M.M Feb 15 '15 at 21:22
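
The layout the comments above argue about can be sketched in portable C++. This is only an illustration under stated assumptions: `make_bstr_image` and `stored_byte_length` are hypothetical helpers, not Windows APIs, and the block they build is ordinary heap memory rather than a real BSTR. It assumes the documented shape: a 4-byte length prefix holding the byte count, the UTF-16 data, and one 16-bit null terminator (which is what the screenshot in the thread suggests).

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of any Windows API: builds the bytes of a
// BSTR-shaped block (length prefix, UTF-16 data, one null code unit).
std::vector<unsigned char> make_bstr_image(const std::u16string& text)
{
    // The prefix counts bytes, not characters, and excludes the terminator.
    const std::uint32_t byte_len =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));

    std::vector<unsigned char> image(
        sizeof(byte_len) + byte_len + sizeof(char16_t), 0);
    std::memcpy(image.data(), &byte_len, sizeof(byte_len));
    std::memcpy(image.data() + sizeof(byte_len), text.data(), byte_len);
    // The trailing two bytes stay zero: a single 16-bit null terminator.
    return image;
}

// Reads the prefix back from the front of the block.
std::uint32_t stored_byte_length(const std::vector<unsigned char>& image)
{
    std::uint32_t byte_len = 0;
    std::memcpy(&byte_len, image.data(), sizeof(byte_len));
    return byte_len;
}
```

For `u"HELLO"` the prefix holds 10 (0xA), which matches the `\xA` in the "very ugly" literal, and a BSTR pointer would point just past the prefix, which is what the `+ 2` in that literal skips when `wchar_t` is 16 bits.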

2 Answers


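User-defined literals would be the way to go:

"HELLO"_bstr calls BSTR operator "" _bstr ( const char*, std::size_t), which can then call SysAllocString() (after widening the text, since SysAllocString() takes a wide string). A string literal operator is an ordinary function taking a pointer and a length; the template<char...> form applies only to numeric literals.

User-defined literals are new in VS14 (Visual Studio 2015).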

[edit]

Based on the comments, it might be better to return a _bstr_t or other class which takes ownership of the SysAllocString() result and implicitly converts to BSTR. This temporary will be destroyed at the end of the full expression, and therefore after iFoo->function( "HELLO"_bstr ); returns.
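
A minimal sketch of that idea, written so it builds without the Windows headers. Everything here is a stand-in and an assumption: `FakeBstr`, `fake_alloc`, `fake_free`, `OwnedBstr` and `count_live_during_call` are hypothetical names playing the roles of `BSTR`, `SysAllocStringLen()`, `SysFreeString()`, a `_bstr_t`-like owner, and a method with an `[in]` parameter. The counter is only there to make the lifetime visible.

```cpp
#include <cstddef>
#include <string>

// Stand-ins for BSTR, SysAllocStringLen() and SysFreeString(); assumptions
// made so the sketch builds without the Windows headers.
using FakeBstr = char16_t*;
int g_live_strings = 0;  // number of strings currently allocated

FakeBstr fake_alloc(const char16_t* s, std::size_t len)
{
    FakeBstr p = new char16_t[len + 1];
    std::char_traits<char16_t>::copy(p, s, len);
    p[len] = u'\0';
    ++g_live_strings;
    return p;
}

void fake_free(FakeBstr p)
{
    if (p) {
        delete[] p;
        --g_live_strings;
    }
}

// Owns one string and converts implicitly to the raw pointer type.
class OwnedBstr {
public:
    OwnedBstr(const char16_t* s, std::size_t len) : p_(fake_alloc(s, len)) {}
    OwnedBstr(const OwnedBstr&) = delete;
    OwnedBstr& operator=(const OwnedBstr&) = delete;
    OwnedBstr(OwnedBstr&& other) noexcept : p_(other.p_) { other.p_ = nullptr; }
    ~OwnedBstr() { fake_free(p_); }
    operator FakeBstr() const { return p_; }

private:
    FakeBstr p_;
};

// The literal operator returns the owner, not the raw pointer.
OwnedBstr operator""_bstr(const char16_t* s, std::size_t len)
{
    return OwnedBstr(s, len);
}

// Plays the role of a method with an [in] parameter: it reads the string
// and does not free it. Returns how many strings are alive during the call.
int count_live_during_call(FakeBstr) { return g_live_strings; }
```

Calling `count_live_during_call(u"HELLO"_bstr)` sees one live string, and once that full expression ends the destructor has released it, so `g_live_strings` is back to zero. Returning the raw pointer from the operator instead would leave nothing to run the release.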

MSalters
  • That solves the "ugly syntax" problem, although it causes a memory leak if used like `iFoo->function( "hello"_bstr );`. – M.M Jan 13 '15 at 11:31
  • @MattMcNabb: Isn't the destructor called when `iFoo->function` returns? – TonyK Jan 13 '15 at 11:34
  • @MattMcNabb: I'd have to check the exact rules. It's fixable by returning a temporary which implicitly converts to `BSTR` but frees the string afterwards. But I thought that passing a BSTR implies passing ownership. – MSalters Jan 13 '15 at 13:07
  • @TonyK: A `BSTR` is just a pointer, there is no destructor to call if the operator returns a `BSTR` directly. You would have to make the operator return an instance of a class instead, and that class could have a `BSTR` conversion operator defined. – Remy Lebeau Jan 13 '15 at 17:53
  • @MSalters: simply passing a `BSTR` does not imply passing ownership. It depends on the contract of the parameter. If the parameter is marked as `[out]` or `[out,retval]`, the function allocates the `BSTR` and passes ownership to the caller. If the parameter is marked as `[in,out]`, the function is allowed to reallocate the source `BSTR`, and the caller maintains ownership of whatever the `BSTR` is set to upon exit. If the parameter is marked as `[in]`, ownership is not changed at all. – Remy Lebeau Jan 13 '15 at 17:55
  • @RemyLebeau: Yikes, that explains why I didn't remember the exact rules. And it also looks like typical C: type-safe in name only, one type with multiple different behaviors. `[out]` is de facto not part of the type system. I don't think it ever makes sense to pass a "BSTR literal" to an `[out]` argument, yet the compiler has no way to prevent it. – MSalters Jan 13 '15 at 18:51
  • @MSalters: a parameter marked with `[out]` must be passed by address, not by value. So you would not be able to pass a `BSTR` to an `[out]` parameter because it is expecting a `BSTR*` instead. So a custom `operator "" _bstr` that returns a `BSTR` (directly or otherwise) would only be usable with `[in]` parameters anyway. – Remy Lebeau Jan 13 '15 at 18:56
  • @RemyLebeau: Ok, then my idea of a temporary object with implicit conversion to `BSTR` and destruction on exit is indeed safe. – MSalters Jan 13 '15 at 19:08
  • @MSalters: yes, it is. I have added an example to demonstrate it. – Remy Lebeau Jan 13 '15 at 19:17

To follow up on @MSalters's answer, a custom user-defined literal could look something like this:

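#include <atlbase.h>   // CComBSTR
#include <cstddef>     // std::size_t

// Narrow literal: CComBSTR treats char* data as ANSI and widens it.
CComBSTR operator "" _bstr (const char* str, std::size_t len)
{
    return CComBSTR(static_cast<int>(len), str);
}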

Then you can do this (as CComBSTR has a BSTR conversion operator defined):

iFoo->function( "HELLO"_bstr );

You can even overload the operator for multiple input string literal types:

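CComBSTR operator "" _bstr (const wchar_t* str, std::size_t len)
{
    return CComBSTR(static_cast<int>(len), str);
}

CComBSTR operator "" _bstr (const char16_t* str, std::size_t len)
{
    // wchar_t is 16 bits on Windows, so the UTF-16 data can be viewed in place;
    // keep the const instead of casting it away.
    return CComBSTR(static_cast<int>(len), reinterpret_cast<const wchar_t*>(str));
}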

iFoo->function( L"HELLO"_bstr ); // calls wchar_t* version with UTF-16 encoded data

iFoo->function( u"HELLO"_bstr ); // calls char16_t* version with UTF-16 encoded data

iFoo->function( u8"HELLO"_bstr ); // calls char* version with UTF-8 encoded data...

Note the last case. Since the operator will not know whether it is being passed ANSI or UTF-8 data, and CComBSTR assumes ANSI when passed char* data, you should use a different literal suffix to differentiate so you can convert the UTF-8 correctly. (This applies before C++20; from C++20 on, u8 literals have type char8_t and need their own overload.) For example:

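#include <codecvt>   // std::codecvt_utf8_utf16 (deprecated since C++17)
#include <locale>    // std::wstring_convert
#include <string>

CComBSTR operator "" _utf8bstr (const char* str, std::size_t len)
{
    // Decode the UTF-8 bytes to UTF-16 before handing them to CComBSTR.
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> conv;
    std::wstring wstr = conv.from_bytes(std::string(str, len));
    return CComBSTR(static_cast<int>(wstr.length()), wstr.c_str());
}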

iFoo->function( u8"HELLO"_utf8bstr );
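
How the compiler picks between the overloaded operators can be checked without ATL. The sketch below is an illustration only: the `_w16` suffix and the `Widened` struct are hypothetical, and the result is a `std::u16string` rather than a `CComBSTR`. It assumes ASCII input for the narrow overload and 16-bit values for the `wchar_t` overload, and it leaves out `u8` literals because their type differs between language versions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: the widened text plus the name of the overload
// that produced it.
struct Widened {
    std::u16string text;
    const char* source;
};

Widened operator""_w16(const char* s, std::size_t len)
{
    // Assumes plain ASCII input: each byte maps to one UTF-16 code unit.
    return { std::u16string(s, s + len), "char" };
}

Widened operator""_w16(const wchar_t* s, std::size_t len)
{
    // Assumes every wchar_t value fits in 16 bits.
    return { std::u16string(s, s + len), "wchar_t" };
}

Widened operator""_w16(const char16_t* s, std::size_t len)
{
    // Already UTF-16, so the data is copied as-is.
    return { std::u16string(s, len), "char16_t" };
}
```

With these in scope, `"HELLO"_w16`, `L"HELLO"_w16` and `u"HELLO"_w16` each select the overload whose pointer type matches the literal's prefix, which is the same selection the `_bstr` overloads above rely on.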
Remy Lebeau