
I'm currently using iconv to convert documents with different encodings.

The iconv() function has the following prototype:

size_t iconv (
  iconv_t cd,
  const char* * inbuf,
  size_t * inbytesleft,
  char* * outbuf,
  size_t * outbytesleft
);

So far I have only had to convert buffers of type char*, but I have realized I might also have to convert buffers of type wchar_t*. In fact, iconv even has a dedicated encoding name, "wchar_t", for such buffers. This encoding adapts to the operating system settings: on my computers, it refers to UCS-2 on Windows and to UTF-32 on Linux.

But here lies the problem: if I have a wchar_t* buffer, I can reinterpret_cast it to char* to pass it to iconv, but then I face implementation-defined behavior: I cannot be sure that all compilers will behave the same way regarding the cast.

What should I do here?

ereOn
  • In fact, the WCHAR encoding is crucial if you want to translate between the unspecified "system encoding" that you get after `mbstowcs()` and a definite encoding... – Kerrek SB Sep 03 '11 at 15:51

1 Answer


reinterpret_cast<char const*> is safe and not implementation-defined, at least not on any real implementation.

The language explicitly allows any object to be reinterpreted as an array of characters and the way you get that array of characters is using reinterpret_cast.
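As a rough illustration of that rule, the sketch below builds the pointer and byte count that would be handed to iconv() as *inbuf and *inbytesleft. The names ByteView, as_bytes and from_bytes are hypothetical helpers invented for this example, not part of iconv, and the actual iconv() call is left out because its exact prototype differs between implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper type: the pointer/length pair iconv() takes as input.
struct ByteView {
    const char* data;   // would become *inbuf
    std::size_t size;   // would become *inbytesleft, counted in bytes
};

// View a wide string's storage as raw bytes. Reading any object through
// a char pointer is permitted, so this cast is the portable part.
ByteView as_bytes(const std::wstring& text) {
    return { reinterpret_cast<const char*>(text.data()),
             text.size() * sizeof(wchar_t) };
}

// Rebuild a wide string from raw bytes, for example from an output buffer
// that was filled using the "wchar_t" encoding. memcpy is used instead of
// casting char* back to wchar_t*, which could be misaligned.
std::wstring from_bytes(const char* data, std::size_t size) {
    assert(size % sizeof(wchar_t) == 0);  // must hold whole wchar_t units
    std::wstring text(size / sizeof(wchar_t), L'\0');
    std::memcpy(&text[0], data, size);
    return text;
}
```

Note that the length is in bytes, not characters, and that the safe direction is wchar_t* to char*. Going the other way, from an arbitrary char buffer to wchar_t*, is not covered by the same guarantee, which is why the sketch copies instead of casting.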

James McNellis
  • Thank you for clearing my thoughts. I assumed converting the `char*` should be safe enough, but I needed to make sure of that. – ereOn Sep 03 '11 at 16:11