
I'm porting an application from using char* everywhere to using UCS-4 as its internal Unicode representation. I use C11's U"unicode literals" to define strings; these expand to arrays of char32_t, which is a typedef for uint_least32_t (in practice, uint32_t).

The problem is properly annotating printf-like functions with the format attribute. Since the format parameter is no longer a char*, the compiler refuses to compile the declaration, and I suppose it also won't accept a char32_t * argument in place of a char * for the %s format.

I don't depend on the standard library's *printf family at all, so formatting is done purely by my own implementation.
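For illustration, here is a minimal sketch of the kind of function in question. The name u32_snprintf and its behaviour (only %s for char32_t strings and %% are handled) are assumptions made up for this example, not the actual implementation. The comment marks where the format attribute would normally go; GCC rejects it there because the format parameter is not a char-based string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <uchar.h>

/* Hypothetical UCS-4 formatter (illustrative only).
 * Adding __attribute__((format(printf, 3, 4))) to this declaration
 * is what GCC refuses, since fmt is const char32_t * rather than
 * const char *. */
size_t u32_snprintf(char32_t *out, size_t cap, const char32_t *fmt, ...);

size_t u32_snprintf(char32_t *out, size_t cap, const char32_t *fmt, ...)
{
    va_list ap;
    size_t n = 0; /* code points the full result needs, like snprintf */

    assert(fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt; ++fmt) {
        if (*fmt == U'%' && fmt[1] == U's') {
            /* %s takes a NUL-terminated char32_t string here */
            const char32_t *s = va_arg(ap, const char32_t *);
            for (; *s; ++s) {
                if (n + 1 < cap)
                    out[n] = *s;
                n++;
            }
            ++fmt; /* skip the 's' */
        } else {
            if (*fmt == U'%' && fmt[1] == U'%')
                ++fmt; /* %% emits a single '%' */
            if (n + 1 < cap)
                out[n] = *fmt;
            n++;
        }
    }
    va_end(ap);

    /* Always terminate when there is room for at least one element. */
    if (cap > 0)
        out[n < cap ? n : cap - 1] = 0;
    return n;
}
```

A call such as u32_snprintf(buf, 16, U"a%sb", U"xy") fills buf with the four code points a, x, y, b plus a terminator. Without the attribute, the compiler checks neither the format string nor the argument types.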

What is the correct solution for this, other than disabling the attribute altogether?

hippietrail
toriningen
  • A side question: what benefit do you expect to gain from using UTF-32 instead of UTF-8? And are you really sure that's worth it? (UTF-32 has multi-codepoint glyphs too.) – Deduplicator Nov 04 '14 at 18:53
  • My application operates solely on code points, so there is really no point in my considering grapheme clusters, user-perceived characters and such. UCS-4 greatly simplifies string processing for now, as I can reuse most of the existing codebase, and I will migrate the internal representation to UTF-8 in the next iteration. – toriningen Nov 04 '14 at 18:58
  • I also seem to miss the point of the `U"..."` stuff. It seems like a complicated step, in particular since C11 adds only minor support for handling these. You could just use the `"\u2002"` notation to encode all the Unicode code points that you need as multibyte strings. For the question itself, you should probably ask the GCC people directly. This is not very common, so you really need their expertise on the question. – Jens Gustedt Nov 04 '14 at 20:41

1 Answer


There is currently no way to do this in GCC. It is a known bug; see GCC bug 64862.

Tom Tromey