2

Is there a well-known algorithm for converting an arbitrary number (either float or integer) into a string representation in such a way that the sort order is maintained?

To illustrate what I mean: if my numbers would be limited to be integers in the range [0,9999], it's very simple: I could just pad all numbers to have four digits:

1000
200
30
4

=>

0004
0030
0200
1000

But how do I do this for arbitrary numbers, including integers, floats and negatives?

My usecase is that I want to be able to store numbers as string in a database, and still be able to sort on them.

wvdz
  • 16,251
  • 4
  • 53
  • 90
  • What programming language are you using? – Tim Biegeleisen Nov 02 '19 at 15:24
  • @TimBiegeleisen I'm looking for a generic solution, not one that is limited to a certain programming language. To clarify, I added my usecase to the bottom of the question. – wvdz Nov 02 '19 at 15:27
  • This isn't really an algorithm question IMHO. Every programming language/database would have a slightly different way of padding your numbers to fixed width. Honestly, if you want the data to sort as numbers in a database table, you might want to maintain _two_ columns, one the original numeric data, and the other perhaps a computed column which renders the numbers as text. – Tim Biegeleisen Nov 02 '19 at 15:30
  • 1
    The real question in my mind is why you'd want to store numbers as strings in a database. That is a decidedly odd thing to do, and introduces all kinds of problems. Can you tell us why you think you need to do this? – Jim Mischel Nov 02 '19 at 16:01
  • @JimMischel I'm trying out a single table design where it would help me a lot if I could reuse the same column for all datatypes. – wvdz Nov 02 '19 at 16:40
  • @TimBiegeleisen so maybe my question should have been something like: how do programming languages/databases internally represent floating point numbers. There's gotta be a more or less standardized way to do this. I don't think they will all reinvent the wheel. – wvdz Nov 02 '19 at 16:42
  • Internally, most machines these days will use [IEEE-754](https://en.wikipedia.org/wiki/IEEE_754) to represent floating-point numbers. And most languages will use the native representation. SQL Server also uses that format. I don't know about any other databases, but it's probably a good bet that they use IEEE-754, as well. – Jim Mischel Nov 02 '19 at 19:27
  • I'm no database expert, but I'm pretty sure that trying to force your data type to fit the database schema is a really bad idea. The normal method is to build a schema that fits the data. – Jim Mischel Nov 02 '19 at 19:30
  • @wvdz Databases don't store floating point numbers with padding. – Tim Biegeleisen Nov 03 '19 at 00:06
  • It seems I found my answer here: https://stackoverflow.com/questions/43299299/sorting-floating-point-values-using-their-byte-representation I can just use the IEEE-754 binary representation and do some flipping logic based on whether the number is positive or negative. Thanks for your comments, they helped me look in the right direction. – wvdz Nov 03 '19 at 09:44

0 Answers0