
I need to convert a set of strings such as /azurite/spot00 to integers so I can use them in ML libraries. Hand-rolling an enumerating algorithm (assign i++ to each new label) sounds easy enough, but it is nowhere near as elegant as a bidirectional hash between std::string and int (I'm not sure whether I need int64 or something else).

std::hash doesn't claim to be reversible. Is there anything in the standard library that does this?

HolyBlackCat
Vorac

1 Answer


There's no general-purpose way to construct a bijection between std::string and int, for the simple, mundane reason that there are more possible std::strings than there are ints. (Specifically, there's effectively an unbounded number of possible std::strings, but only 2^32 or 2^64 distinct values of a 32-bit or 64-bit integer.)

There are ways to construct perfect hash functions from strings to integers if you have a fixed set of strings you want to work with. In your case, though, if the goal is just to label each string with a distinct value, your initial idea of keeping a counter and assigning each new string the next available number is probably just fine.
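For illustration, here is a minimal sketch of that counter approach, assuming C++17. LabelEncoder and its member names are hypothetical, not something the standard library provides: a std::unordered_map handles the string-to-integer direction and a std::vector handles the reverse lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of the standard library): assigns consecutive
// integer IDs to distinct strings and supports lookup in both directions.
class LabelEncoder {
public:
    // Returns the ID for `label`, assigning the next free ID the first time
    // the label is seen.
    std::size_t encode(const std::string& label) {
        auto [it, inserted] = id_of_.try_emplace(label, label_of_.size());
        if (inserted) {
            label_of_.push_back(label);  // index in the vector == assigned ID
        }
        assert(id_of_.size() == label_of_.size());  // both directions stay in sync
        return it->second;
    }

    // Returns the string for a previously assigned ID.
    // Throws std::out_of_range if the ID was never assigned.
    const std::string& decode(std::size_t id) const {
        return label_of_.at(id);
    }

    // Number of distinct labels seen so far.
    std::size_t size() const { return label_of_.size(); }

private:
    std::unordered_map<std::string, std::size_t> id_of_;  // string -> ID
    std::vector<std::string> label_of_;                   // ID -> string
};
```

With this sketch, the first call to encode("/azurite/spot00") yields 0, a different label gets 1, and repeating a label returns its existing ID. The IDs depend on insertion order, so they are only stable across runs if the labels are fed in the same order or the mapping is saved alongside the model.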

templatetypedef