2

I have an array of floating point numbers of type T, where T could be either float or double

T x[n];

These numbers are strictly positive and sorted, i.e.

0 < x[0] < x[1] < x[2] < ... < x[n-1]

I want to find the smallest floating point number H of type T such that the following inequality is strictly satisfied:

int(x[i+1]*H) > int(x[i]*H)   // for i=0..n-2

Let's assume that the numbers x are such that there are no underflow or overflow issues to worry about.

I am approaching the problem as described below, which I am not sure if it is robust and, when it works, I think produces a suboptimal result.

Is there a robust and accurate way to solve this? I could accept a suboptimal solution, but at least it needs to be robust.


My current method

I use the notation m(x) to indicate a floating point number which approximates x and may be affected by roundoff error. In the following passages, I modify the inequality below taking as appropriate upper and lower bounds. Note that the below is not to be interpreted as source code, but as the matehmatical steps and reasoning to get to the final equations.

Let a(x) be the closest floating point number toward -inf
Let b(x) be the closest floating point number toward +inf

// Original inequality, where the operation affected by rounding error
// are marked with m()
int(m(x[i+1]*H)) > int(m(x[i]*H))   // for i=0..n-2

// I remove `atoi` subtracting and adding 0.5 
// this is a conceptual operation, not actually executed on the machine,
// hence I do not mark it with an m()
m(x[i+1]*H) - 0.5 > m(x[i]*H) + 0.5   // for i=0..n-2

// I reorganize terms
m(x[i+1]*H) - m(x[i]*H) >  1.0        // for i=0..n-2

// I would like to resolve the m() operator with strict LHS minorant and RHS majorant
// Note I cannot use `a(x)` and `b(x)` because I do not know `H` 
// I use multiplication by `(1+2*eps)` or `(1-2*eps)` as I hope this 
// will result in a number strictly greater (or lower) than the original one. 
// I already know this does not work for example if x is very small.
// So this seems like a pitfall of this approach.
x[i+1]*H(1-2*eps) - x[i]*H*(1+2*eps) >  b(1)  // for i=0..n-2

// solve the inequality for H, marking the operations affected by rounding 
// with m()
H > m( b(1) / m( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) ) // for i=0..n-2

// take majorant or minorant as appropriate for rounded terms
H > b( b(1) / a( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) )   // for i=0..n-2

// and eventually take a majorant which gives me the final formula for H
H = b( b( b(1) / min{ a( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) } ) )
Fabio
  • 2,105
  • 16
  • 26
  • What language is this? What is `atoi`? The C/C++ `atoi` function converts a *string* to an integer, so it can't be that. And what do you mean by `majorant` and `minorant`? The usual mathematical definitions don't seem to fit here. – Mark Dickinson Nov 03 '16 at 20:53
  • 1
    My bad. With atoi I meant truncation to integrer. I corrected the question. As per majorant and minorant, I mean majorant(a)>a and minorant(a) – Fabio Nov 04 '16 at 00:27
  • 1
    Thanks for the updates! Just to make sure I understand: for an example input array of `[0.99, 1.0]`, the optimal output would be `H = 1.0`, while your method produces something that's close to `H = 100.0`. Is that correct? This is an interesting problem ... – Mark Dickinson Nov 04 '16 at 08:41
  • Exactly! Furthermore, I am yet not convinced my method is robust. – Fabio Nov 04 '16 at 12:13

0 Answers0