
I've got a few typedef'd types to hold Unicode strings: UTF16, which is defined as uint_least16_t, and UTF32, which is defined as uint_least32_t (to be compatible with the standard's definitions of char16_t and char32_t).

In Xcode's debugger, UTF32 strings are displayed as

I've tried LLDB's formatters, but they're not entirely working.

Here's the latest version of my UTF32 formatter:

type format add -f unicode32 -C yes UTF32

Other things I've tried:

type format add -f unicode32s -C yes UTF32

type format add -f unicode32[] -C yes UTF32

type format add -f unicode32* -C yes UTF32

and all of those with (UTF32*) and (UTF32[]) as well as without.

I'm just not sure what else to try at this point.

Paul Sanders

1 Answer


Update: I have now been informed that the OP wants to compile for C, rather than C++ and have therefore added the appropriate language tag to the question.

The types in the answer below are also available in C11 onwards, so the rest of this answer still applies.


Try changing your UTF16 typedef to char16_t. This works for me.

For UTF32, either char32_t or wchar_t will work.

Please note that when assigning variables from string literals, the types of the string literals must match, like so (this code requires C11 / C++11 or later):

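int main ()
{
    const char16_t *s1 = u"s1 is char16_t";   // u prefix: array of char16_t
    const char32_t *s2 = U"s2 is char32_t";   // U prefix: array of char32_t
    const wchar_t *s3 = L"s3 is wchar_t";     // L prefix: array of wchar_t
    return 0;                                 // set the breakpoint here
}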

And when I put a breakpoint on the return statement and print these variables in the console, I get this:

(lldb) p s1
(const char16_t *) $0 = 0x0000000100000f44 u"s1 is char16_t"
(lldb) p s2
(const char32_t *) $1 = 0x0000000100000f58 U"s2 is char32_t"
(lldb) p s3
(const wchar_t *) $2 = 0x0000000100000f7c L"s3 is wchar_t"
(lldb) 

You can also see them by hovering the mouse over them or in the 'Locals' window.

The full gamut of string literals available in C11 / C++11 is described here (C) and here (C++).

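To answer the question you pose in the comments: in C++, no header file is needed, as these are all built-in types. In C11, char16_t and char32_t are typedefs declared in `<uchar.h>`, and wchar_t is declared in `<stddef.h>` and `<wchar.h>`. If your compiler is objecting to these, please check the C / C++ version you are using in the build settings for your project in Xcode.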

Paul Sanders
  • MacOS does not support char16_t or char32_t, hence making those types in the first place. What OS are you using? I'm on High Sierra. –  Jun 18 '18 at 17:22
  • Can you clarify what you mean by "MacOS does not support char16_t or char32_t" please? The _compiler_ certainly supports them. – Paul Sanders Jun 19 '18 at 07:14
  • `uchar.h` is not present on MacOS... char16_t and 32 are not defined. –  Jun 19 '18 at 09:32
  • @watwat Updated my post, hopefully this will clear things up. – Paul Sanders Jun 19 '18 at 19:52
  • To put it another way, you are using the wrong types in your code. The types I recommend are the ones you're supposed to use for Unicode strings. – Paul Sanders Jun 19 '18 at 21:26
  • No they're not. They might be the same _size_, but they're not the same _type_, see [live demo](https://wandbox.org/permlink/qjpuzI1dadbXmLAO). Just change the types in your program already, you're doing it wrong. Then you can accept my answer as it answers your question. – Paul Sanders Jun 20 '18 at 17:31
  • [They both suffer from the same exact problem dude](https://wandbox.org/permlink/k870845n1gonVFZa), they're literally the same. –  Jun 20 '18 at 17:38
  • and that's without mentioning that I LITERALLY can not use char32_t because it's not defined on my platform. –  Jun 20 '18 at 17:40
  • [char32_t is LITERALLY defined by the standard to be uint_least32_t](http://en.cppreference.com/w/c/string/multibyte/char32_t) –  Jun 20 '18 at 17:44
  • #1: Your code is missing the necessary `const` specifiers, that's why both lines fail to compile. #2: As already mentioned, you need to tell Xcode to compile for **C++11 or later** in the build settings for your project. #3: No it isn't. You are still failing to distinguish between the width of a type and what the compiler believes it to represent. A `float` and an `int` are both 32 bits but they represent fundamentally different things. Well, it's the same here. Now please stop arguing the toss with me and do some proper background reading. Thank you. – Paul Sanders Jun 20 '18 at 18:38
  • Why is const necessary? I don't use const strings in my code... I'm writing C, not C++, and Xcode is set to build it as C11 (without gcc extensions). I have read the background, and I'm not trying to argue. You're ignoring the core of my argument, hence this extended argument. UTF32 and char32_t are both defined to be uint_least32_t. How are they different types? They're the same type, just using different aliases. –  Jun 20 '18 at 19:05
  • Ah, you're writing in _C_, why didn't you say? No wonder you were confused. Any reason not to switch to C++? – Paul Sanders Jun 20 '18 at 19:16
  • Ahh, I figured I did mention I was writing C; I guess I forgot. I do try to mention that whenever I ask questions about code. I looked into C++ for templates, and especially Unicode strings, but it's even more of a clusterfuck in C++ than it is in C. I put a lot of thought into supporting Unicode before I dived in. Originally I was going to do it similarly to C++, with a struct that contains various properties, but I want to support string and character literals, which C cannot support unless they're a null-terminated array. C++ can, but there's no point shoehorning it into my C library. –  Jun 20 '18 at 19:21
  • `wchar_t` appears to be supported in C, see https://stackoverflow.com/questions/12996062/how-to-initialize-a-wchar-t-variable. `char16_t` and `char32_t` are C++ only, I believe. In future, please include the relevant language tag(s). People are very hot on that here, and perhaps now you can see why. I have now added it for you. – Paul Sanders Jun 20 '18 at 19:32
  • C supports them as of C11. The problem with wchar_t is that it's UTF-16 on Windows and UTF-32 on Unix-based platforms. BTW, C was the first tag... feel free to check the edit history of my post. It was either there at the very first, or within the first few minutes, and you didn't respond until well after that tag was added either way. –  Jun 20 '18 at 19:36
  • Fine, compile as C11 then, problem solved. And it was me who added that tag, check the history. You need to calm down a bit if you're going to enjoy your time on this site. – Paul Sanders Jun 20 '18 at 20:16
  • I have edited my answer to mention C11. If this works for you, please [accept it](https://stackoverflow.com/help/someone-answers). That is the done thing here. Thank you. – Paul Sanders Jun 21 '18 at 05:00