0

As division operation (/) is expensive in case of FPGA ? Is it possible to perform division of two Q15 format numbers(16 bit fixed point number) with basic shift operations?

Could someone help me by providing some example?

Thanks in advance!

  • 2
    What's Q15? Please provide an example, possibly some code you tried. –  Jan 02 '17 at 23:08
  • I'm assuming for the OP that Q15is a 16-bit fixed point number. https://en.wikipedia.org/wiki/Q_%28number_format%29 – wilcroft Jan 03 '17 at 21:34
  • yes it is 16-bit fixed point number. – kotgire mohan Jan 04 '17 at 00:05
  • @kotgiremohan In my understanding Q15 is a purely fractional fixed-point format capable of representing numbers in [-1, 1). It seems to me that in many instances, the result of division would not be representable as a Q15 number (whenever magnitude of divisor is smaller than magnitude of dividend, the mathematical result has magnitude greater than 1, causing overflow). That said, look for info on restoring or non-restoring binary division: these are based on shifts, adds, and subtracts. Reading Wikipedia for those will already provide a decent start, much more can be found by internet search. – njuffa Jan 04 '17 at 18:35

1 Answers1

0

Fixed-point arithmetic is just integer arithmetic with a bit of scaling thrown in. Q15 is a purely fractional format stored as a signed 16-bit integer with scale factor of 215, able to represent values in the interval [-1, 1). Clearly, division only makes sense in Q15 when the divisor's magnitude exceeds the dividend's magnitude, as otherwise the quotient's magnitude exceeds the representable range.

Before embarking on a custom Verilog implementation of fixed-point division, you would want to check your FPGA vendor's library offerings as a fixed-point library including pipeline division is often available. There are also opens source projects that may be relevant, such as this one.

When using integer division operators for fixed-point division, we need to adjust for the fact that the division will remove the scale factor, i.e (a * 2scale) / (b * 2scale) = (a/b), while the correct fixed-point result is (a/b * 2scale). This is easily fixed by pre-multiplying the dividend by 2scale, as in the following C implementation:

int16_t div_q15 (int16_t dividend, int16_t divisor)
{
     return (int16_t)(((int32_t)dividend << 15) / (int32_t)divisor);
}

Wikipedia gives a reasonable overwiew on how to implement binary division on a bit-by-bit basis using add, subtract, and shift operations. These methods are closely related to the longhand division taught in grade school. For FPGAs, the use of the non-restoring method if often preferred, as pointed out by this paper, for example:

Nikolay Sorokin, "Implementation of high-speed fixed-point dividers on FPGA". Journal of Computer Science & Technology, Vol. 6, No. 1, April 2006, pp. 8-11.

Here is C code that shows how the non-restoring method may be used for the division of 16-bit two's-complement operands:

/* bit-wise non-restoring two's complement division */
void int16_div (int16_t dividend, int16_t divisor, int16_t *quot, int16_t *rem)
{
    const int operand_bits = (int) (sizeof (int16_t) * CHAR_BIT);
    uint16_t d = (uint16_t)divisor;
    uint16_t nd = 0 - d; /* -divisor */
    uint16_t r, q = 0; /* remainder, quotient */
    uint32_t dd = (uint32_t)d << operand_bits; /* expanded divisor */
    uint32_t pp = dividend; /* partial remainder */
    int i;

    for (i = operand_bits - 1; i >= 0; i--) {
        if ((int32_t)(pp ^ dd) < 0) {
            q = (q << 1) + 0;   /* record quotient bit -1 (as 0) */
            pp = (pp << 1) + dd;
        } else {
            q = (q << 1) + 1;   /* record quotient bit +1 (as 1) */
            pp = (pp << 1) - dd;
        }
    }
    /* convert quotient from digit set {-1,1} to plain two's complement */
    q = (q << 1) + 1;

    /* remainder is upper half of partial remainder */
    r = (uint16_t)(pp >> operand_bits);

    /* fix up cases where we worked past a partial remainder of zero */
    if (r == d) { /* remainder equal to divisor */
        q = q + 1;
        r = 0;
    } else if (r == nd) { /* remainder equal to -divisor */
        q = q - 1;
        r = 0;
    }

    /* for truncating division, remainder must have same sign as dividend */
    if (r && ((int16_t)(dividend ^ r) < 0)) {
        if ((int16_t)q < 0) {
            q = q + 1;
            r = r - d;
        } else {
            q = q - 1;
            r = r + d;
        }
    }
    *quot = (int16_t)q;
    *rem  = (int16_t)r;
}

Note that there are multiple ways of dealing with the various special cases that arise in non-restoring division. For example, one frequently sees code that detects a zero partial remainder pp and exits the loop over the quotient bits early in this case. Here I assume that an FPGA implementation would unroll the loop completely to create a pipelined implementation, in which case early termination is not helpful. Instead, a final correction is applied to those quotients that are affected by ignoring a partial remainder of zero.

In order to create a Q15 division from the above, we have to make just a single change: incorporating the up-scaling of the dividend. Instead of:

uint32_t pp = dividend; /* partial remainder */

we now use this:

uint32_t pp = dividend << 15; /* partial remainder; incorporate Q15 scaling */

The resulting C code (sorry, I won't provide read-to-use Verilog code) including the test framework is:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <limits.h>
#include <math.h>

/* bit-wise non-restoring two's complement division */
void q15_div (int16_t dividend, int16_t divisor, int16_t *quot, int16_t *rem)
{
    const int operand_bits = (int) (sizeof (int16_t) * CHAR_BIT);
    uint16_t d = (uint16_t)divisor;
    uint16_t nd = 0 - d; /* -divisor */
    uint16_t r, q = 0; /* remainder, quotient */
    uint32_t dd = (uint32_t)d << operand_bits; /* expanded divisor */
    uint32_t pp = dividend << 15; /* partial remainder, incorporate Q15 scaling */
    int i;

    for (i = operand_bits - 1; i >= 0; i--) {
        if ((int32_t)(pp ^ dd) < 0) {
            q = (q << 1) + 0;   /* record quotient bit -1 (as 0) */
            pp = (pp << 1) + dd;
        } else {
            q = (q << 1) + 1;   /* record quotient bit +1 (as 1) */
            pp = (pp << 1) - dd;
        }
    }
    /* convert quotient from digit set {-1,1} to plain two's complement */
    q = (q << 1) + 1;

    /* remainder is upper half of partial remainder */
    r = (uint16_t)(pp >> operand_bits);

    /* fix up cases where we worked past a partial remainder of zero */
    if (r == d) { /* remainder equal to divisor */
        q = q + 1;
        r = 0;
    } else if (r == nd) { /* remainder equal to -divisor */
        q = q - 1;
        r = 0;
    }

    /* for truncating division, remainder must have same sign as dividend */
    if (r && ((int16_t)(dividend ^ r) < 0)) {
        if ((int16_t)q < 0) {
            q = q + 1;
            r = r - d;
        } else {
            q = q - 1;
            r = r + d;
        }
    }
    *quot = (int16_t)q;
    *rem  = (int16_t)r;
}

int main (void)
{
    uint16_t dividend, divisor, ref_q, res_q, res_r;
    double quot, fxscale = (1 << 15);

    dividend = 0;
    do {
        printf ("\r%04x", dividend);
        divisor = 1;
        do {
            quot = trunc (fxscale * (int16_t)dividend / (int16_t)divisor);
            /* Q15 can only represent numbers in [-1, 1) */
            if ((quot >= -1.0) && (quot < 1.0)) {
                ref_q = (int16_t)((((int32_t)(int16_t)dividend) << 15) / 
                                  ((int32_t)(int16_t)divisor));
                q15_div ((int16_t)dividend, (int16_t)divisor, 
                         (int16_t *)&res_q, (int16_t *)&res_r);
                if (res_q != ref_q) {
                    printf ("!r dividend=%04x (%f) divisor=%04x (%f)  res=%04x (%f)  ref=%04x (%f)\n", 
                            dividend, (int16_t)dividend / fxscale, 
                            divisor, (int16_t)divisor / fxscale, 
                            res_q, (int16_t)res_q / fxscale, 
                            ref_q, (int16_t)ref_q / fxscale);
                }
            }
            divisor++;
        } while (divisor);
        dividend++;
    } while (dividend);

    return EXIT_SUCCESS;
}
njuffa
  • 23,970
  • 4
  • 78
  • 130