
A user-defined literal suffix in C++0x should be an identifier that

  • starts with _ (underscore) (17.6.4.3.5)
  • should not begin with _ followed by uppercase letter (17.6.4.3.2)

    Each name that [...] begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.

Is there any reason why such a suffix may not start with _ followed by a digit, e.g. _4 or _3musketeers?

Musketeer dartagnan = "d'Artagnan"_3musketeers;
int num = 123123_4; // to be interpreted in base4 system?
string s = "gdDadndJdOhsl2"_64; // base64decoder
towi
  • The spec says "Each name that ...". A user defined literal operator name is like `operator "" _Foo`. There is no name in here that is spelled like `_Foo`, so 17.6.4.3.2 does not apply. I also wonder, from where do you read that a suffix may not start with `_` followed by a digit? I haven't found such a rule. – Johannes Schaub - litb Apr 25 '11 at 11:46
  • Hmm, I think I overlooked that the library also allows an impl to use `_Foo` as a macro, and so using `operator "" _Foo` is not safe to use. – Johannes Schaub - litb Apr 25 '11 at 14:56
  • I gave the references there. `_Xxx` and `__anything` is reserved for implementation use everywhere, `_anything` in the global namespace. – towi Apr 26 '11 at 07:20
  • the last one I disagree. You can say `operator "" _anything` in the global namespace, because I don't think "the global namespace" includes macros. Macros aren't "global". Only "_Anything" and "__anything" is reserved for "any use". – Johannes Schaub - litb Apr 26 '11 at 12:55

4 Answers

1

"can" vs "may".
can denotes ability where may denotes permission.

Is there a reason why you would not have permission to start a user-defined literal suffix with _ followed by a digit?

Permission implies coding standards or best practices. The examples you provide seem to show that _\d would make fine suffixes if used correctly (to denote a numeric base). Unfortunately, your question can't have a well-thought-out answer, as no one has experience with this new language feature yet.

Just to be clear, user-defined literal suffixes can start with _\d.
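As a minimal sketch of the base-4 idea from the question: the cooked literal operator below receives the value of the decimal spelling and reinterprets its digits in base 4. The helper name from_base4 is hypothetical, and the no-space spelling operator""_4 is used here only because current compilers accept it without complaint. A literal with a leading zero would be lexed as octal first, which this sketch does not handle.

```cpp
#include <stdexcept>

// Hypothetical helper: treat the decimal digits of `decimal_spelling`
// as base-4 digits. Written as a single return so it is valid C++11 constexpr.
constexpr unsigned long long from_base4(unsigned long long decimal_spelling)
{
    return decimal_spelling == 0
        ? 0
        : (decimal_spelling % 10 < 4
               ? from_base4(decimal_spelling / 10) * 4 + decimal_spelling % 10
               : throw std::invalid_argument("digit is not valid in base 4"));
}

// Suffix that starts with an underscore followed by a digit.
constexpr unsigned long long operator""_4(unsigned long long decimal_spelling)
{
    return from_base4(decimal_spelling);
}

// 1*1024 + 2*256 + 3*64 + 1*16 + 2*4 + 3
static_assert(123123_4 == 1755, "123123 read as base 4");
static_assert(10_4 == 4, "10 read as base 4");
```

An invalid digit such as 5_4 in a constant expression would be rejected at compile time, because the throw branch is not a constant expression.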

deft_code
  • Excuse my sloppy formulation. Of course, in a std text "may" and "can" have strict meaning. As far as I know, there *can not* be a best-practice yet, as there is still a compiler to come which supports user-defined literals. At least gcc 4.7.0 doesn't. But good to know that you also read into the text that `_\d` is ... "*valid*". And from the other constraints I can not see "*discourage*" either. – towi Apr 26 '11 at 07:15
1

The precedent for identifiers of the form _<number> is the function argument placeholder object mechanism in std::placeholders (§20.8.9.1.3), which defines an implementation-defined number of such symbols.

This is a good thing, because it means the user cannot #define any identifier of that form. §17.6.4.3.1/1:

A translation unit that includes a standard library header shall not #define or #undef names declared in any standard library header.

The name of the user-defined literal function is operator "" _123, not simply _123, so there is no direct conflict between your name and the library name, even in the presence of using namespace std::placeholders;.
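A small sketch of that coexistence, assuming a hypothetical suffix _1 that simply adds 1000 so its effect is visible. The functions subtract and demo_placeholder_and_suffix are illustrative names, not anything from the standard.

```cpp
#include <functional>

// Hypothetical literal operator whose suffix matches a placeholder name.
constexpr unsigned long long operator""_1(unsigned long long v)
{
    return v + 1000;
}

using namespace std::placeholders; // makes _1, _2, ... visible unqualified

inline int subtract(int a, int b) { return a - b; }

inline int demo_placeholder_and_suffix()
{
    // Here _1 and _2 are the placeholder objects from std::placeholders.
    auto swapped = std::bind(subtract, _2, _1);

    // Here 5_1 is a single literal token that calls operator""_1.
    return swapped(1, 10) + static_cast<int>(5_1);
}
```

The two uses do not interfere: the placeholder is an ordinary identifier, while the suffix is only ever looked up as a literal operator name.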

My 2¢, though, is that you would be better off with an operator "" _baseconv and encoding the base within the literal, "123123_4"_baseconv.
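One way this suggestion might look, as a sketch only: the string literal operator splits the text at the last underscore and treats what follows as the base. The DIGITS_BASE format and the restriction to bases 2 through 10 are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical operator: "DIGITS_BASE"_baseconv, for bases 2..10 only.
inline unsigned long long operator""_baseconv(const char* text, std::size_t length)
{
    const std::string s(text, length);
    const std::string::size_type sep = s.rfind('_');
    if (sep == std::string::npos || sep == 0 || sep + 1 == s.size())
        throw std::invalid_argument("expected DIGITS_BASE");

    // std::stoi throws std::invalid_argument if no number follows the underscore.
    const int base = std::stoi(s.substr(sep + 1));
    if (base < 2 || base > 10)
        throw std::invalid_argument("only bases 2..10 are supported in this sketch");

    unsigned long long value = 0;
    for (std::string::size_type i = 0; i < sep; ++i) {
        const int digit = s[i] - '0';
        if (digit < 0 || digit >= base)
            throw std::invalid_argument("digit out of range for base");
        value = value * base + digit;
    }
    return value;
}
```

With this, "123123_4"_baseconv yields the same value as reading 123123 in base 4, and a single operator covers every supported base, at the cost of run-time rather than compile-time checking.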

Edit: Looking at Johannes' (deleted) answer, there may be concern that _123 could be used as a macro by the implementation. This is certainly the realm of theory, as the implementation would have little to gain by such preprocessor use. Furthermore, if I'm not mistaken, the reason for hiding these symbols in std::placeholders, not std itself, is that such names are more likely to be used by the user, such as by inclusion of Boost Bind (which does not hide them inside a named namespace).

The tokens are not reserved for use by the implementation globally (17.6.4.3.2), and there is precedent for their use, so they are at least as safe as, say, forward.

Potatoswatter
  • I meant that my concern was whether it could use `_N` as a macro (I mean `N` literally, not a number), because `_N` or `__n` is "reserved for any use". For example, for sure a `operator "" _WIN32` is not a good idea, likewise for my trailing example using `_Vector`. :) – Johannes Schaub - litb Apr 26 '11 at 20:46
  • @Johannes: Oh. How is that relevant? – Potatoswatter Apr 26 '11 at 21:12
  • @Potatoswatter I thought you are saying that `operator "" _2` is fine. I want to clarify that I didn't mean to say that `operator "" _2` is forbidden and that I didn't mean to say that there might be a macro for `_2`. I.e you are saying "Looking at Johannes' (deleted) answer, there is concern that _123 could be used as a macro by the implementation.", just wanted to clarify that the concern is not coming from me if there is such concern. – Johannes Schaub - litb Apr 26 '11 at 21:23
  • @Johannes: OK, fixed that. But why do you consider `_N` relevant? – Potatoswatter Apr 26 '11 at 21:27
  • @Potatoswatter I meant `_N` as a placeholder for anything starting with a underscore followed by an uppercase letter. – Johannes Schaub - litb Apr 26 '11 at 21:28
  • @Johannes: Right, but OP isn't suggesting doing that, as far as I can tell. – Potatoswatter Apr 26 '11 at 21:47
  • Good point, I hadn't thought of *placeholders* `_1`, `_2` and so on. And your reference to *§17.6.4.3.1/1*, too. But are those not in `std::`?. That leaves `_2bedone` valid, not so bad, right? – towi Apr 27 '11 at 07:12
  • @towi: They are named `std::placeholders::_1`, etc. This is a change from Boost which put them in the global namespace (actually an anonymous namespace under global). Yes, I suppose that is a valid identifier for a literal operator or anything else :vP . – Potatoswatter Apr 27 '11 at 07:20
1

An underscore followed by a digit is a legal user-defined literal suffix. The function signature would be operator"" _4(), so it couldn't get eaten by a placeholder. The literal would be a single preprocessor token, 123123_4, so the _4 would not get clobbered by a placeholder or a preprocessor symbol.
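The single-token point can be illustrated with a deliberately hostile macro. This is a sketch for demonstration only: the macro named _4 and the identity-returning operator are hypothetical, and defining such a macro in real code would be a poor idea.

```cpp
// Hypothetical literal operator that returns its argument unchanged.
constexpr unsigned long long operator""_4(unsigned long long v)
{
    return v;
}

// Hostile macro, defined only to show what the preprocessor does and does not touch.
#define _4 + 1000

// 7_4 is one preprocessing token, so the macro is not expanded inside it.
static_assert(7_4 == 7, "suffix survives the macro");

// With a space, 7 and _4 are separate tokens and the macro is expanded: 7 + 1000.
constexpr int with_space = 7 _4;
static_assert(with_space == 1007, "separate tokens are macro-expanded");

#undef _4
```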

My reading of 17.6.4.3.5 is that suffixes not containing a leading underscore risk collision with the implementation or future library additions. They also collide with existing suffixes: F, L, ULL, etc. One of the rationales for user-defined literals is that a new type (such as decimals, for example) could be defined as a pure library extension, including literals with suffixes d, df, dl.

Then there's the question of style and readability. Personally, I think I would lose sight of the suffix in 1234_3. Maybe, maybe not.

Finally, there was an idea that didn't make it into the standard (but which I kind of like) to have _ be a literal separator for numbers, as in Ada and Ruby. So you could have 123_456_789 to visually separate thousands, for example. Your suffix would break if that ever went through.

emsr
  • I often write `10*1000*1000*1000` to visualize large numbers (with lots of `0`s). But don't write `123*456*789`, though :-) – towi Apr 27 '11 at 07:18
  • `123123_4` is one preprocessor token, very good point! But `"123123"_4` isn't. So, no help there. But I still think, because the placeholders are in `std::`, we should not have a problem here. I still don't know what about `_` in the *global* namespace (not in `std::`). – towi Apr 27 '11 at 07:20
  • According to the standard and according to the implementation that is being cooked up in gcc "123123"_4 *will* be a new preprocessor token. – emsr Apr 27 '11 at 19:37
  • Also, user-defined operator with suffix _5 does not insert '_5' by itself into global namespace but rather 'operator"" _5()' or perhaps 'template operator"" _5()'. The _5 will be mangled (in addition to arguments and scope). – emsr Apr 27 '11 at 19:40
1

I knew I had some papers on this subject: Digital Separators describes a proposal to use _ as a digit separator in numeric literals.

Ambiguity and Insecurity with User-Defined literals describes the evolution of ideas about literal suffix naming and namespace reservation, and efforts to deconflict user-defined literals against a future digit separator.

It just doesn't look that good for the _ digit separator.

I had an idea though: how about either a backslash or a backtick for digit separator? It isn't as nice as _ but I don't think there would be any collision as long as the backslash was inside the stream of digits. The backtick has no lexical use currently that I know of.

i = 123\456\789;
j = 0xface\beef;

or

i = 123`456`789;
j = 0xface`beef;

This would leave _123 as a literal suffix.

emsr