Is there a speed difference in ordering by int vs. float?

Question

When retrieving entries in a database, is there a difference between storing values as a float or decimal vs. an int when using ORDERBY in a SELECT statement?

Bohemian · Answer 1 · 2011-09-09T22:37:35.520

8

(Edited) Since both int and float occupy exactly the same space on disk, and of course in memory - ie 32 bits - the only differences are in the way they are processed.

int should be faster to sort than float, because the comparison is simpler: Processors can compare ints in one machine cycle, but a float's bits have to be "interpreted" to get a value before comparing (not sure how many cycles, but probably more than one, although some CPUs may have special support for float comparison).

edited Sep 09 '11 at 22:37

answered Sep 09 '11 at 11:28

Bohemian

412,405
93
575
722

1

Good point about the actual CPU cost, I was focused on disk and memory – billinkc Sep 09 '11 at 11:38
I would agree, but for decimals, it should not be true: they're coded somehow like old EBCEDIC and should compare quickly in binary mode. – iDevlop Sep 09 '11 at 12:36
Looks like it totally depends on the processor implementation : some have floating point units that makes them very fast. However +1 because ints are still guaranteed to be fast in all cases. – Klaim Sep 09 '11 at 12:47
1

While what you say is correct, it is also blatantly incomplete, in the sense that you fail to point out how irrelevant a few extra CPU cycles are, in most cases. IOW : you fail to point out that for a few extra CPU cycles to make a _noticeable_ difference, you need millions and millions of comparisons, iow thousands and thousands of rows to be compared, and if you're in _that_ scenario then whatever "noticeable" difference you might get from CPU consumption will typically still be almost nothing, compared to the typical time needed for disk IO for such volumes. – Erwin Smout Sep 09 '11 at 14:53
1

@Erwin Smou - sorry I didn't spell it out for you that the difference I pointed out was the *only* difference, since both types occupy the exact same space on disk and in memory - 32 bits. I thought I answered the question well. I have edited the answer the explicitly state what is common knowledge. – Bohemian Sep 09 '11 at 22:32

billinkc · Accepted Answer · 2011-09-09T12:36:52.993

It depends. You didn't specify the RDBMS so I can only speak to SQL Server specifically but data types have different storage costs associated with them. Ints range from 1 to 8 bytes, Decimals are 5-17 and floats are 4 to 8 bytes.

The RDBMS will need to read data pages off disk to find your data (worst case) and they can only fit so many rows on an 8k page of data. So, if you have 17 byte decimals, you're going to get 1/17th the amount of rows read off disk per read than you could have if you sized your data correctly and used a tinyint with a 1 byte cost to store X.

That storage cost will have a cascading effect when you go to sort (order by) your data. It will attempt to sort in memory but if you have a bazillion rows and are starved for memory it may dump to temp storage for the sort and you're paying that cost over and over.

Indexes may help as the data can be stored in a sorted manner but again, if getting that data into memory may not be as efficient for obese data types.

[edit]

@Bohemian makes a fine point about the CPU efficiency of integer vs floating point comparisons but it is amazingly rare for the CPU to be spiked on a database server. You are far more likely to be constrained by the disk IO subsystem and memory which is why my answer focuses on the speed difference between getting that data into the engine for it to perform the sort operation vs the CPU cost of comparison.

Thanks, @billinkc. The database will most likely be PostgreSQL. I would probably need a *bigint* so I have more flexibility when saving records to be sorted later in a particular order. — Matt Norris, Sep 12 '11 at 04:01

score 2 · Answer 3 · answered Sep 09 '11 at 12:42

In general, the choice of datatypes should be driven by whether the datatype is appropriate for storing the values that are required to be stored. If a given datatype is inadequate, it doesn't matter how efficient it is.

In terms of disk i/o the speed difference is second order. Don't worry about second order effects until your design is good with regard to first order effects.

Correct index design will result in a huge decrease in delays when a query can be retrieved in sorted order to begin with. However, speeding up that query is done at the cost of slowing down other processes, like processes that modify the indexed data. The trade off has to be considered to see whether it's worth it.

In short, worry about the stuff that's going to double your disk i/o or worse before you worry about the stuff that's going to add 10% to your disk i/o

Good point about the design, Walter. The true reason I'm asking the question is because I want to construct a relative timeline http://stackoverflow.com/questions/7342264 in the database and am wondering if I should just use a bigint or float to sort events. After all, if I space the numbers apart when creating the records on an int, I can insert a new event in between 2 old events by taking the number halfway in between. I could also use a decimal, since I could always just add a new place at the end to insert in between (.50 --> .501 --> .51). — Matt Norris, Sep 12 '11 at 03:56

Is there a speed difference in ordering by int vs. float?

3 Answers3

Linked