Efficiently convert an unsigned short to a char*

Question

What would be an efficient, portable way to convert an unsigned short to a char* (i.e. convert 25 to "25")?

I'd like to avoid getting strings (std::string) involved. Performance is important in this case, since this conversion will need to happen quickly and often.

I was looking into things such as using sprintf, but would like to explore any and all ideas.

Have you tried using `sprintf`/`snprintf`? Having done so, have you profiled the code and determined that this is a performance hotspot? — James McNellis, Jan 18 '11 at 21:26
The tables at the bottom of the article linked below clearly illustrate where stdlib implementations reside as far as efficiency and optimality of implementation are concerned: http://www.codeproject.com/KB/recipes/Tokenizer.aspx — , Jan 19 '11 at 04:20

Jonathan Grynspan · Accepted Answer · 2011-01-18T21:31:37.543

First off, do it right, then do it fast--only optimize if you can see for certain that a piece of code is not performant.

snprintf() into a buffer will do what you want. Is it the fastest possible solution? Not at all. But it is among the simplest, and it will suffice to get your code into a working state. From there, if you see that those calls to snprintf() are so laborious that they need to be optimized, then and only then seek out a faster solution.
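
As a minimal sketch of that starting point (the helper name format_ushort and its return convention are assumptions for illustration, not part of this answer, and std::snprintf assumes a C++11 standard library):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats value as decimal text into dest, which has
// room for size bytes including the terminating null.
// Returns the number of digits written, or -1 if the text did not fit.
int format_ushort(unsigned short value, char* dest, std::size_t size) {
  assert(dest != nullptr && size > 0);
  // %hu is the printf conversion for unsigned short.
  int n = std::snprintf(dest, size, "%hu", value);
  if (n < 0 || static_cast<std::size_t>(n) >= size) {
    return -1;  // encoding error, or the output was truncated
  }
  return n;
}
```

A buffer of 6 bytes is enough when unsigned short is 16 bits wide (5 digits plus the null); the truncation check covers platforms where it is wider.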

score 2 · Answer 2 · answered Jan 18 '11 at 21:34

An array of strings such that

array[25] = "25";
array[26] = "26";

array[255] = "255";

maybe? You could write a small program that generates the table source code for you quite easily, and then use this file in your project.
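
A rough sketch of such a generator, assuming the table is emitted as an array of string literals (the function name write_table_source and the output layout are made up for illustration):

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical generator: writes the source of a table called `name` whose
// entry i is the decimal text of i, for every i in [0, count).
void write_table_source(std::ostream& out, const std::string& name,
                        unsigned count) {
  assert(count > 0);  // a zero-length array would not compile
  out << "static const char* const " << name << "[" << count << "] = {\n";
  for (unsigned i = 0; i < count; ++i) {
    out << "  \"" << i << "\",\n";  // a trailing comma is allowed in the list
  }
  out << "};\n";
}
```

Calling it with a count of 65536 and writing the result to a header would cover a 16-bit unsigned short; the conversion then becomes a single index into the table. The trade-off is size: a table with that many entries would likely take several hundred kilobytes on typical platforms.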

Edit: I don't get what you mean when you say you don't want to get strings involved.

James


I think by strings they mean "std::string"s. I was confused at that too. – gravitron Jan 18 '11 at 21:38

Nim · Answer 3 · 2011-01-19T08:35:05.460

2

try this:

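// dest must have room for 6 bytes: up to 5 digits plus the terminating null
// (assumes a 16-bit unsigned short). Returns the number of digits written.
int convert(unsigned short val, char* dest)
{
  int i = 0;
  if (val >= 10000)
  {
    dest[i++] = (val / 10000) | 0x30;  // 0x30 is ASCII '0'
    val %= 10000;
  }
  if (i || val >= 1000)  // once a digit is written, interior zeros must be kept
  {
    dest[i++] = (val / 1000) | 0x30;
    val %= 1000;
  }
  if (i || val >= 100)
  {
    dest[i++] = (val / 100) | 0x30;
    val %= 100;
  }
  if (i || val >= 10)
  {
    dest[i++] = (val / 10) | 0x30;
    val %= 10;
  }
  dest[i++] = (val) | 0x30;  // the ones digit is always written, so 0 gives "0"
  dest[i] = 0;
  return i;
}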

edited Jan 19 '11 at 08:35

answered Jan 18 '11 at 22:48


perhaps a final `dest[i] = '\0';` ? – Tony Delroy Jan 19 '11 at 01:36

score 1 · Answer 4 · answered Jan 18 '11 at 21:29

I would say at least try sprintf and, since you have this tagged as C++, std::stringstream, and actually profile them. In many cases the compiler is smart enough to build something that works pretty well. Only when you know it's going to be a bottleneck do you need to actually find a faster way.
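
For the stream-based candidate, a minimal sketch to profile might look like this (the helper name ushort_to_string is an assumption; it returns a std::string, which the question wanted to avoid, so treat it only as a baseline for comparison):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats value through a string stream.
std::string ushort_to_string(unsigned short value) {
  std::ostringstream out;
  out << value;  // unsigned short is streamed as a number, not a character
  assert(!out.fail());
  return out.str();
}
```

Constructing a new stream on every call is likely to dominate the cost, so a fair measurement would also try reusing one stream across calls.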

Fred Nurk · Answer 5 · 2011-01-19T04:25:52.280

I hacked together a test of various functions here, and this is what I came up with:

write_ushort: 7.81 s
uShortToStr: 8.16 s
convert: 6.71 s
use_sprintf: 49.66 s

(write_ushort is my version, which I tried to write as clearly as possible rather than micro-optimize; it formats into a given character buffer. use_sprintf is the obvious sprintf(buf, "%d", x) and nothing else; the other two are taken from other answers here.)

This is a pretty amazing difference between them, isn't it? Who would ever think to use sprintf faced with almost an order of magnitude difference? Oh, yeah, how many times did I iterate each tested function?

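// Taken directly from my hacked up test, but should be clear.
// Compiled with gcc 4.4.3 and -O2.  This test is interesting, but not authoritative.
// The tested functions must be declared before main(); each one is called
// as NAME(x, buf, sizeof buf).
#include <climits>   // USHRT_MAX
#include <ctime>     // clock, clock_t, CLOCKS_PER_SEC
#include <iostream>  // cout
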
int main() {
  using namespace std;
  char buf[100];

#define G2(NAME,STMT) \
  { \
    clock_t begin = clock(); \
    for (int count = 0; count < 3000; ++count) { \
      for (unsigned x = 0; x <= USHRT_MAX; ++x) { \
        NAME(x, buf, sizeof buf); \
      } \
    } \
    clock_t end = clock(); \
    STMT \
  }
#define G(NAME) G2(NAME,) G2(NAME,cout << #NAME ": " << double(end - begin) / CLOCKS_PER_SEC << " s\n";)
  G(write_ushort)
  G(uShortToStr)
  G(convert)
  G(use_sprintf)
#undef G
#undef G2

  return 0;
}

sprintf converted the entire possible range of unsigned shorts, then did the whole range again 2,999 more times, at about 0.25 µs per conversion on average, on my ~5-year-old laptop.

sprintf is portable; is it also efficient enough for your requirements?

My version:

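#include <cassert>  // assert
#include <climits>  // USHRT_MAX

// Returns the number of non-null bytes written, or that would be written.
// If ret is null, does not write anything; otherwise retlen is the length of
// ret, and must include space for the number plus a terminating null.
// Requires uint_width_10 (shown further down) to be declared first.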
int write_ushort(unsigned short x, char *ret, int retlen) {
  assert(!ret || retlen >= 1);

  char s[uint_width_10<USHRT_MAX>::value];  // easy implementation agnosticism
  char *n = s;
  if (x == 0) {
    *n++ = '0';
  }
  else while (x != 0) {
    *n++ = '0' + x % 10;
    x /= 10;
  }

  int const digits = n - s;
  if (ret) {
    // Not needed: the loop below checks retlen and writes only into the
    // available space.
    //assert(retlen >= digits + 1);

    while (--retlen && n != s) {
      *ret++ = *--n;
    }
    *ret = '\0';
  }
  return digits;
}

Compile-time logarithm functions written with template metaprogramming (TMP) are nothing new, but I'm including this complete example because it's what I used:

template<unsigned N>
struct uint_width_10_nonzero {
  enum { value = uint_width_10_nonzero<N/10>::value + 1 };
};
template<>
struct uint_width_10_nonzero<0> {
  enum { value = 0 };
};
template<unsigned N>
struct uint_width_10 {
  enum { value = uint_width_10_nonzero<N>::value };
};
template<>
struct uint_width_10<0> {
  enum { value = 1 };
};
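
For readers less used to template metaprogramming, the value those templates produce can be sketched as an ordinary runtime loop (decimal_width is a hypothetical name, not something used in the test above):

```cpp
#include <cassert>

// Hypothetical runtime counterpart of uint_width_10<N>::value:
// the number of decimal digits needed to print n, with 1 for n == 0.
int decimal_width(unsigned n) {
  int width = 1;
  while (n >= 10) {
    n /= 10;  // drop the lowest digit
    ++width;
  }
  return width;
}
```

With a 16-bit unsigned short, USHRT_MAX is 65535 and the width is 5, which is why the scratch array s in write_ushort holds 5 characters and needs no room for a terminating null.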

Actually the difference is not that amazing: sprintf supports a wide variety of format specifiers, and its algorithm has to account for the possible presence of any of them. I have never seen sprintf, or even a slower beast like boost::format, take a significant percentage of running time in a profile, so in most cases it is not worth optimizing. — Öö Tiib, Jan 19 '11 at 18:14
A little angel told me not to use sarcasm in this post and that someone would comment on it. I didn't listen. It only appears like a huge difference because the functions are executed nearly 200 million times (USHRT_MAX, here 65,535, * 3,000). @ÖöTiib: You have restated the conclusion that I showed the OP without using a profiler, since I don't have access to the OP's code to profile it. — Fred Nurk, Jan 19 '11 at 23:39
Yes, but I also provided a solution, because ... who knows. You cannot be sarcastic if you do not see the real issue. For some reason text-based interfaces and protocols have become popular, and if two modules talk to each other using sprintf heavily to pass millions of unsigned shorts back and forth, then it may become a bottleneck. I would of course prefer to switch to a binary interface/protocol in such a case, but reconsidering interfaces may be expensive or out of the question for other reasons. — Öö Tiib, Jan 21 '11 at 02:53
