Embedded C Cortex-M4: Properly handle division by big numbers?

Question

I'm writing embedded code for spectroscopy. In order to build my spectrum, I need to map linearly samples from an interval (dynamic range is given by the physics/specs of the problem) to another. Basically after data is processed I have a series of samples (peaks) and every one of them will contribute to the spectrum (i.e. will increment the counter of a specific bin in a histogram). Here's a sketch: So in C, I need to map each peak value into the [0:4095] and I'm doing this in real-time on a MCU (LPC4370) so I need to go fast. The problem is that my dumb implementation is squeezing everything to 0. here's what i did:

 #define MCA_SIZE     4096
 #define PEAK_MAX     1244672762
 #define PEAK_MIN     6000000

 int32_t mca[MCA_SIZE];
 int32_t peak_val;
 int32_t bin_val;

[...]

 if(peak_val > PEAK_MIN)
      {
       bin_val = (int)(MCA_SIZE*(peak_val-PEAK_MIN)/(PEAK_MAX-PEAK_MIN));

       /*Increment corrispondent multi channel bin*/
       mca[bin_val]+=1;
      };

Where every quantity is int32 if lower cas, #define is upper case. The problem is that is believe this one

(peak_val-PEAK_MIN)/(PEAK_MAX-PEAK_MIN)

Goes very often near zero. So I end-up having just the first one or two bins filled.

Here's a screenshot of first values of mca after few thousand iterations:

Here's the disassbly view of the code under study, along with register status at breakpoint.

What is the best/fastest way to handle this kind of problem?

Why are you using `mca[peak_val/1000]+=1;` after computing `bin_val`? I'd have expected something like `++mca[bin_val];`. — EOF, Apr 27 '19 at 14:44
You are doing it right. But you are looking at the wrong place for errors. Your calculation is done MCA_SIZE*(peak_val-PEAK_MIN) first and then /(PEAK_MAX-PEAK_MIN). That order is guaranteed by the standard and otherwise you would always get zero. If in doubt you could check the assembler code of it. (Btw. If you limit your peak value also limit the upper value.) — user6556709, Apr 27 '19 at 15:21
Is a logarithmic range out of the question? You appear to have an FPU at your disposal. Failing that, there are log2 functions which are reasonably fast on ALUs. — John McFarlane, Apr 28 '19 at 01:40
HI @JohnMcFarlane, I need to map linearly. Frome the [datasheet](https://studio.segger.com/packages/LPC4300/CMSIS/Documents/UM10503.pdf) _The ARM Cortex-M4 is implemented with a Memory Protection Unit supporting eight regions, a hardware Floating Point Unit (FPU), debugging features, and a system tick timer._ So FPU is there. — a_bet, Apr 28 '19 at 10:15
@JohnMcFarlane I took a look at the FPU registers (both single and double precision) and right after bin_val evaluation they are all empty (0). — a_bet, Apr 28 '19 at 12:32
I stepped through all the instruction calls, one-by-one, and the FPU registers are not touched. I believe I'm not using the FPU then:| — a_bet, Apr 28 '19 at 12:38
Maybe I'm misreading the problem, but shouldn't you be calculating this as `bin_val = (peak_val - PEAK_MIN) / ((PEAK_MAX - PEAK_MIN) / MCA_SIZE)` if you're doing this in integer math (though do note how this rounds). If you were trying to do this as a floating point calculation, your code likely lacks a cast to force the issue. — Hasturkun, Apr 28 '19 at 15:03

score 3 · Accepted Answer · answered Apr 28 '19 at 12:44

3

The intermediate result (MCA_SIZE*(peak_val-PEAKMIN)) is too large for a 32-bit integer datatype. I would use uint64_t for these calculations, and I would define all of your constants as const uint64_t rather than using a #define, adding a suffix of ULL to their literal values.

answered Apr 28 '19 at 12:44

Elliot Alderson

638
4
8

Thanks @Elliot Alderson, that was the problem. Even if the tools were not complaining at all I missed it. Do you think there are faster/better ways to do this? – a_bet Apr 28 '19 at 13:45
Given just the information that you have provided I don't know of a better way. The division will probably be the slowest part of the calculation, so if you could somehow arrange your data so that `PEAKMAX - PEAKMIN` was an integer power of 2 then you could substitute a right shift for the long int division. – Elliot Alderson Apr 28 '19 at 15:33
Yeah ok I was thinking about that. Or I will loose some precision on that division. – a_bet Apr 28 '19 at 17:29

izac89 · Answer 2 · 2019-04-28T03:24:45.673

1

Please note that your code is likely to produce signed integer overflow, which is undefined in the standard.

From the C99 standard (§3.4.3/1)

An example of undeﬁned behavior is the behavior on integer overﬂow

So I would start there, either move to unsigned, use a wider type or change the boundaries.

Also, as user6556709 mentioned in it's comment, the expression:

(MCA_SIZE*(peak_val-PEAK_MIN)/(PEAK_MAX-PEAK_MIN))

Is guaranteed to be executed as if it was written as follows, due to left-to-right associativity for those group of operators (note the parentheses):

((MCA_SIZE*(peak_val-PEAK_MIN))/(PEAK_MAX-PEAK_MIN))

So the always-zero-evaluated expression (peak_val-PEAK_MIN)/(PEAK_MAX-PEAK_MIN) is not performed, the expression (MCA_SIZE*(peak_val-PEAK_MIN)) is done prior, so that is not the main problem.

I would recommend providing some examples for peak_val in which bins are not filled.

edited Apr 28 '19 at 03:24

answered Apr 27 '19 at 18:08

izac89

3,790
7
30
46

Thanks @user2162550 for the reply. What do you mean by "not marked"? Do you mean bin_val != 0? – a_bet Apr 27 '19 at 19:24
I meant "not filled", not incremented. – izac89 Apr 27 '19 at 19:27
I'm not sure what you'd like me to post. I added aa screenshot from the debugging tools, hope it is on-point:) – a_bet Apr 27 '19 at 19:35
@user2162550 if overflow is occurring, how is unsigned going to help? – John McFarlane Apr 28 '19 at 01:02
@John McFarlane it will remove the undefined behaviour – izac89 Apr 28 '19 at 03:27
Declaring `uint32_t bin_val` and evaluating `bin_val = (uint32_t)(MCA_SIZE*(peak_val-PEAK_MIN)/(PEAK_MAX-PEAK_MIN));` gives me same results as before. Just bin 0 and 1 filled, all others are empty. – a_bet Apr 28 '19 at 12:07
@a_bet You need to **think** about this...you can fool the debugger as well as the runtime. What is the maximum possible value of `(MCA_SIZE*(peak_val-PEAKMIN))` and will it fit in a 32-bit integer datatype? – Elliot Alderson Apr 28 '19 at 12:39
@ElliotAlderson yes sorry, in some cases it can overflow. I'm now trying to scale down the fraction: `bin_val = (uint32_t)(MCA_SIZE*(peak_val >> 12-PEAK_MIN >> 12)/(PEAK_MAX >> 12 - PEAK_MIN >> 12));` – a_bet Apr 28 '19 at 12:57
@user2162550 yes, but how is that going to help? The result will still be wrong, but now you cannot employ a sanitizer to trap the error. – John McFarlane Apr 28 '19 at 22:30

Embedded C Cortex-M4: Properly handle division by big numbers?

2 Answers2