
What is the fastest cross-platform, cross-compiler (GCC and MSVC) way to multiply a 64-bit integer by a 32-bit integer and divide the product by a 64-bit integer, as in:

uint64_t counter;
uint32_t resolution = NANOS_IN_SEC; // NANOS_IN_SEC = 1000000000
uint64_t freq;

uint64_t res = (counter * resolution) / freq; // but without overflow/losing precision

The result is guaranteed to always fit into 64 bits.

I checked a lot of answers, but all of them solved the full 64b x 64b multiplication and were quite slow.

Is there a way to reduce the code complexity when the second operand can be assumed to be only 32 bits wide?
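To make the overflow concern concrete, here is a minimal sketch of when the naive `counter * resolution` wraps around. The helper name `naive_product_overflows` is hypothetical and only serves the illustration.

```c
#include <stdint.h>

#define NANOS_IN_SEC 1000000000u

// Hypothetical helper: returns 1 if counter * NANOS_IN_SEC would wrap around
// in 64-bit unsigned arithmetic, 0 otherwise.
int naive_product_overflows(uint64_t counter)
{
    return counter > UINT64_MAX / NANOS_IN_SEC;
}
```

The threshold is 18446744073 ticks. Assuming a 10 MHz counter purely for illustration, that is reached after roughly 1845 seconds, i.e. about half an hour of uptime.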

mvorisek
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/218105/discussion-on-question-by-mvorisek-multiply-64b-x-32b-divide-by-64b-integers). – Samuel Liew Jul 18 '20 at 17:20

1 Answer


I ended up with a specific solution. It also accepts frequencies above 32 bits, trading some precision in that case because the frequency is scaled down to fit.

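static uint64_t counter_and_freq_to_nanotime(uint64_t counter, uint64_t freq)
{
    uint32_t freq32;
    uint64_t q, r;

    // Scale freq down until it fits in 32 bits. counter is halved together
    // with it so that the counter/freq ratio is preserved (the dropped low
    // bits are the precision cost).
    while (freq >= (1ull << 32)) {
        freq /= 2;
        counter /= 2;
    }
    freq32 = (uint32_t)freq;

    // counter = q * freq32 + r, hence
    // counter * N / freq32 = q * N + (r * N) / freq32.
    q = counter / freq32;
    r = counter % freq32;
    // r < 2^32 and NANOS_IN_SEC < 2^30, so r * NANOS_IN_SEC fits in 64 bits.
    return q * NANOS_IN_SEC + (r * NANOS_IN_SEC) / freq32;
}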

Quick benchmark (E5-2699v4, Win7 x64):

  • MFllMulDiv: ~50 ns
  • this solution: ~1.5 ns
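For comparison, a full-precision variant can be sketched with a wide intermediate product. This is not what was benchmarked above and it may well be slower than the split-and-scale approach. The function name `mul_u64_u32_div_u64` is hypothetical, and the sketch assumes either the GCC/Clang `unsigned __int128` extension or x64 MSVC with `_umul128` and `_udiv128` (the latter needs Visual Studio 2019 or later).

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>     // _umul128
#include <immintrin.h>  // _udiv128
#elif defined(__SIZEOF_INT128__)
__extension__ typedef unsigned __int128 u128;  // GCC/Clang extension
#else
#error "this sketch needs a 128-bit product (x64 MSVC or GCC/Clang __int128)"
#endif

// Hypothetical helper: computes a * b / c without intermediate overflow.
// The caller must guarantee that the quotient fits in 64 bits.
uint64_t mul_u64_u32_div_u64(uint64_t a, uint32_t b, uint64_t c)
{
    assert(c != 0);
#if defined(_MSC_VER) && defined(_M_X64)
    unsigned __int64 hi, rem;
    unsigned __int64 lo = _umul128(a, b, &hi);  // 128-bit product in hi:lo
    return _udiv128(hi, lo, c, &rem);           // faults if quotient > 64 bits
#else
    return (uint64_t)(((u128)a * b) / c);
#endif
}
```

Called as `mul_u64_u32_div_u64(counter, NANOS_IN_SEC, freq)`, it gives the truncated exact quotient for any nonzero `freq`, including values above 32 bits.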
E_net4
mvorisek
  • What is `NANOS_IN_SEC` ? Without knowing it, a downvote is worth it. – Basile Starynkevitch Jul 18 '20 at 17:15
  • @BasileStarynkevitch Updated my question to state this exact constant, and also added a short benchmark showing the speed difference compared with the fastest MSVC native solution. – mvorisek Jul 18 '20 at 17:26
  • Yes, as I told you, you need to convince your manager or client to switch to a better C compiler, such as [GCC](http://gcc.gnu.org/) - which runs on Linux, Windows, MacOSX and has extensive cross-compilation facilities, while being very close to the [C11 standard](https://web.cs.dal.ca/~vlado/pl/C_Standard_2011-n1570.pdf). Your problem is not technical, but managerial. Being forced to use MSVC is not the rational way to get good performance. Otherwise, ask client to upgrade the hardware. The time you are spending is worth a lot more than the putative optimization gain – Basile Starynkevitch Jul 18 '20 at 19:59
  • What is `MFllMulDiv`? – President James K. Polk Jul 20 '20 at 20:50
  • Regarding the loop, see https://en.wikipedia.org/wiki/Find_first_set#CLZ. Since `div` is a power of 2, depending on how smart the compiler is, it may or may not be faster to do a left shift versus a multiply. – President James K. Polk Jul 20 '20 at 20:54
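The count-leading-zeros suggestion in the last comment could look roughly like the sketch below, which computes the number of halvings in one step instead of looping. The function name `shift_to_fit_32` is hypothetical. The sketch assumes GCC/Clang (`__builtin_clzll`) or a 64-bit MSVC target (`_BitScanReverse64`).

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>  // _BitScanReverse64 (64-bit targets only)
#endif

// Hypothetical helper: number of right shifts needed so that freq fits in
// 32 bits; 0 if it already does.
unsigned shift_to_fit_32(uint64_t freq)
{
    if (freq < (1ull << 32))
        return 0;  // also keeps freq == 0 away from the bit-scan calls
#if defined(_MSC_VER)
    unsigned long msb;  // index of the highest set bit, 32..63 here
    _BitScanReverse64(&msb, freq);
    return (unsigned)msb - 31u;
#else
    return 32u - (unsigned)__builtin_clzll(freq);
#endif
}
```

With `s = shift_to_fit_32(freq)`, the value `freq >> s` is what the halving loop would have produced, and `1u << s` equals the factor the loop accumulates.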