I'm implementing a 64-bit fixed-point signed 31.32 numeric type in C#, backed by a long. So far so good for addition and subtraction. Multiplication, however, has an annoying case I'm trying to solve.
My current algorithm consists of splitting each operand into its most and least significant 32 bits, performing four multiplications into four longs, and adding the relevant bits of those longs together. Here it is in code:
public static Fix64 operator *(Fix64 x, Fix64 y) {
    var xl = x.m_rawValue; // underlying long of x
    var yl = y.m_rawValue; // underlying long of y
    var xlow = xl & 0x00000000FFFFFFFF; // take the 32 lowest bits of x
    var xhigh = xl >> 32; // take the 32 highest bits of x
    var ylow = yl & 0x00000000FFFFFFFF; // take the 32 lowest bits of y
    var yhigh = yl >> 32; // take the 32 highest bits of y
    // perform the four cross multiplications
    var lowlow = xlow * ylow;
    var lowhigh = xlow * yhigh;
    var highlow = xhigh * ylow;
    var highhigh = xhigh * yhigh;
    // take the highest 32 bits of lowlow and the lowest 32 bits of highhigh
    var loResult = lowlow >> 32;
    var midResult1 = lowhigh;
    var midResult2 = highlow;
    var hiResult = highhigh << 32;
    // add everything together and build the result
    var finalResult = loResult + midResult1 + midResult2 + hiResult;
    return new Fix64(finalResult); // this constructor just copies the parameter into m_rawValue
}
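For example, whole numbers multiply correctly (1.0 has the raw value 1L << 32 in this format; FromRaw() is the test helper described just below):

var a = Fix64.FromRaw(2L << 32); // 2.0 in signed 31.32
var b = Fix64.FromRaw(3L << 32); // 3.0
var c = a * b;                   // raw value 6L << 32, i.e. exactly 6.0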
This works in the general case but fails in a number of scenarios: the result is off by exactly 1.0 (in decimal terms), typically for extremely small or extremely large operand values. In the unit tests below, FromRaw() is a helper that builds a Fix64 directly from a long value, without any shifting; it amounts to this:
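public static Fix64 FromRaw(long rawValue) {
    return new Fix64(rawValue); // the long becomes m_rawValue verbatim, no shift
}

Here are some of the failing results from those tests: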
Failed for FromRaw(-1) * FromRaw(-1): expected 0 but got -1
Failed for FromRaw(-4) * FromRaw(6791302811978701836): expected -1.4726290525868535041809082031 but got -2.4726290525868535041809082031
Failed for FromRaw(2265950765) * FromRaw(17179869183): expected 2.1103311001788824796676635742 but got 1.1103311001788824796676635742
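To make the first failing case concrete, here is the operator body unrolled by hand for FromRaw(-1) * FromRaw(-1), i.e. xl = yl = -1; the comments show the values I actually get at each step:

long xl = -1, yl = -1;              // raw values of FromRaw(-1)
var xlow = xl & 0x00000000FFFFFFFF; // 4294967295 (0x00000000FFFFFFFF)
var xhigh = xl >> 32;               // -1 (arithmetic shift preserves the sign)
var ylow = yl & 0x00000000FFFFFFFF; // 4294967295
var yhigh = yl >> 32;               // -1
var lowlow = xlow * ylow;           // -8589934591 (0xFFFFFFFE00000001, wrapped)
var lowhigh = xlow * yhigh;         // -4294967295
var highlow = xhigh * ylow;         // -4294967295
var highhigh = xhigh * yhigh;       // 1
var loResult = lowlow >> 32;        // -2 (0xFFFFFFFFFFFFFFFE)
var hiResult = highhigh << 32;      // 4294967296 (0x0000000100000000)
var finalResult = loResult + lowhigh + highlow + hiResult;
// = -4294967296 = 0xFFFFFFFF00000000, the raw value of -1.0, whereas the
// true product (-2^-32) * (-2^-32) = 2^-64 should truncate to raw 0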
I've traced the intermediate values as above and tried to work out the logic on paper, but I'm a bit stuck. How can I fix this?