C++: how to convert a wide string to Base64?

Question

What is the best way to convert a wide string to Base64?

Ben Voigt · Accepted Answer · 2011-05-23T15:51:27.163

6

The conversion from octets (8-bit symbols) to Base64 (6-bit symbols) operates on bytes, not characters, so it works the same way regardless of your string encoding.

To be clear: Base64 is not a character encoding. Sender and receiver need to agree on the character encoding (ASCII, UTF-8, UTF-16, UCS-2, etc) as well as the transport method (Base64, gzip, etc).
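To illustrate that the conversion only ever sees bytes, here is a minimal sketch of an encoder using the standard Base64 alphabet with `=` padding. The name `base64_encode` is hypothetical and not taken from any library; it accepts a `void*` plus a byte count, so the caller decides which encoding those bytes hold.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from any library): Base64-encode `size` raw bytes.
// Takes void* because the input is plain binary data, whatever it represents.
std::string base64_encode(const void* data, std::size_t size) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::string out;
    out.reserve((size + 2) / 3 * 4);

    for (std::size_t i = 0; i < size; i += 3) {
        const std::size_t remaining = size - i;
        // Pack up to three octets into a 24-bit group, padding with zero bits.
        unsigned long group = static_cast<unsigned long>(bytes[i]) << 16;
        if (remaining > 1) group |= static_cast<unsigned long>(bytes[i + 1]) << 8;
        if (remaining > 2) group |= bytes[i + 2];

        // Emit four 6-bit symbols; '=' marks positions that had no input data.
        out += kAlphabet[(group >> 18) & 0x3F];
        out += kAlphabet[(group >> 12) & 0x3F];
        out += remaining > 1 ? kAlphabet[(group >> 6) & 0x3F] : '=';
        out += remaining > 2 ? kAlphabet[group & 0x3F] : '=';
    }
    return out;
}
```

For example, `base64_encode("Man", 3)` yields `TWFu`. The function never inspects characters, so the receiver still has to know which character encoding the decoded bytes are in.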

edited May 23 '11 at 15:51

answered May 23 '11 at 14:08



To clarify, since a `wchar_t` is not an octet, you have to convert wide strings to arrays of octets before base64 encoding. – Dietrich Epp May 23 '11 at 14:13
To clarify further, you need to decide whether to convert your wide string to an intermediate form like UTF-8 before encoding to Base64, or to just skip that, typecast the `wchar_t*` to `const char*` and encode to Base64. – Mike DeSimone May 23 '11 at 14:18
@Dietrich, @Mike: A well-designed conversion API would be taking a `void*`. The size of the chunks the API processes at a time is completely an implementation detail, and it might very well use 16- or 32-bit words internally. `void*` is the right type for binary data (see also `fread`, `memcpy`). – Ben Voigt May 23 '11 at 14:24
@Ben Voigt: But a `wchar_t` is not portable, so if you reverse the encoding on a different platform, you will get a mangled string. Some platforms have 16-bit, others have 32-bit `wchar_t`. Some are big or little endian. Since `wchar_t` is not byte oriented, it should not be base64 encoded, whether or not the API generates a type error. – Dietrich Epp May 23 '11 at 14:35
@Dietrich: That's true, but no one but you is talking about `wchar_t`. Regardless, Base64 preserves the encoding whatever it is. Your comments do apply to Kirill's answer though. But I've added to my answer to clarify this. – Ben Voigt May 23 '11 at 15:50
@Ben Voigt: The question title is "wide string", or does that mean something else besides a string of `wchar_t`...? – Dietrich Epp May 23 '11 at 22:29
@Dietrich: It could mean that, or `char16_t`, and `char32_t`, or any other string with characters outside the ASCII range. If we're being generous, even UTF-8 could qualify. – Ben Voigt May 23 '11 at 23:44
@Ben: "Wide string" is defined as a sequence of wide characters in the relevant standard. "Wide character" is defined as `wchar_t`. The terms "multibyte character" or "multibyte string" are used when speaking of `char16_t` and `char32_t`, strings of which may be *initialized with* wide string literals. I have never heard the term "wide string" outside the C/C++ community, so I use the definition from the C/C++ standards. – Dietrich Epp May 24 '11 at 03:36
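The intermediate-form option raised in the comments above can be sketched as follows: convert the wide string to UTF-8 first, then hand the resulting bytes to whatever Base64 encoder you use. The name `wide_to_utf8` is hypothetical, and the sketch assumes that `wchar_t` holds UTF-16 code units when it is 16 bits wide and UTF-32 code points otherwise. Invalid input is replaced with U+FFFD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert a wide string to UTF-8.
// Assumption: 16-bit wchar_t means UTF-16, anything wider means UTF-32.
std::string wide_to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);

        // Combine a UTF-16 surrogate pair into a single code point.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            const unsigned long low = static_cast<unsigned long>(in[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        // Replace lone surrogates and out-of-range values with U+FFFD.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The UTF-8 bytes are the same on every platform, so a receiver that decodes the Base64 gets the same text regardless of its own `wchar_t` size or byte order.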

Kirill V. Lyadvinsky · Answer 2 · 2011-05-23T14:18:06.467

1

To encode some data to Base64, you can use the Base64 class from the Xerces library. It could look like the following:

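#include <string>
#include <vector>
#include <xercesc/util/Base64.hpp>
#include <xercesc/util/XMLString.hpp>

// Assumes xercesc::XMLPlatformUtils::Initialize() has already been called
std::wstring input_string = L"some wide string";
// Copy into contiguous memory (this vector is not needed in C++0x, where std::wstring storage is contiguous)
std::vector<wchar_t> raw_str( input_string.begin(), input_string.end() );

// Encode the raw wchar_t bytes; the result depends on the platform's wchar_t size and byte order
XMLSize_t len = 0;
XMLByte* data_encoded = xercesc::Base64::encode( reinterpret_cast<const XMLByte*>(&raw_str[0]), raw_str.size()*sizeof(wchar_t), &len );
XMLCh* text_encoded = xercesc::XMLString::transcode( reinterpret_cast<char*>(data_encoded) );

// text_encoded now holds the Base64 text
// ... use text_encoded here ...

xercesc::XMLString::release( &text_encoded );
xercesc::XMLString::release( reinterpret_cast<char**>(&data_encoded) );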
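As a rough illustration of what the encoder receives in this approach, the sketch below copies the in-memory bytes of a wide string. The name `wide_raw_bytes` is hypothetical. The byte count is `size() * sizeof(wchar_t)`, and the byte order is whatever the host uses, which is why the portability concern from the comments on the accepted answer applies here.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the in-memory representation of a wide string.
// The result differs between platforms with 16-bit and 32-bit wchar_t,
// and between little-endian and big-endian hosts.
std::vector<unsigned char> wide_raw_bytes(const std::wstring& s) {
    std::vector<unsigned char> bytes(s.size() * sizeof(wchar_t));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), s.data(), bytes.size());
    }
    return bytes;
}
```

With a 16-bit little-endian `wchar_t`, `L"AB"` becomes the four bytes `41 00 42 00`; with a 32-bit little-endian `wchar_t` it becomes eight bytes. Both sides must therefore agree on the width and byte order before the decoded data can be read back as a wide string.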

edited May 23 '11 at 14:18

answered May 23 '11 at 14:08


This is a solution where no intermediate form such as UTF-8 is used. – Mike DeSimone May 23 '11 at 14:20
Very useful code showing how to release the memory allocated by xercesc – Damian Nov 17 '17 at 01:15

score 0 · Answer 3 · answered May 23 '11 at 14:14


If you are using Visual C++ with MFC, there is already a library to do this. Check out Base64Encode and Base64Decode.

Jonathan Wood
