0

I couldn't come up with a proper title for my question, so allow me to present my case. I want to calculate a significance ratio of the form: p = 1 - X / Y

Here X comes from an iterative process; the process takes a large number of steps and counts how many different ways it can end up in each state (stored in a HashMap). Once the iteration is over, I select a number of states and sum their values. It's hard to tell how large these numbers are, so I intend to implement the sum as a BigInteger.

Y, on the other hand, comes from a binomial coefficient with arguments in the thousands. I am inclined to use logGamma to calculate these coefficients, which gives me the natural logarithm of the value.

What I am interested in is doing the division X / Y in the most effective way. If I can get the natural logarithm of X as well, then I could subtract the logarithms and have my result as 1 - e ^ (lnX - lnY).

I see that a BigInteger can't be passed to Math.log; what can I do in this case?

posdef

3 Answers

2

You may be able to use doubles. A double can be extremely large, up to about 1.8e308. What it lacks is precision: it only supports about 15 significant digits. But if you can live with 15 digits of precision (in other words, if you don't care about the difference between 1,000,000,000,000,000 and 1,000,000,000,000,001), then doubles might get you close enough.
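As a rough illustration of this idea, here is a minimal sketch that converts a `BigInteger` to a double and takes its log. The class and method names are made up for the example. It assumes the value is positive, and it shows the limit of the approach: `BigInteger.doubleValue()` returns infinity once the value no longer fits in a double.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of any library.
final class DoubleLogSketch {
    // ln(x) via conversion to double; assumes x > 0.
    // Returns +Infinity when x is too large for a double.
    static double logViaDouble(BigInteger x) {
        return Math.log(x.doubleValue());
    }

    public static void main(String[] args) {
        BigInteger fits = BigInteger.TEN.pow(300);    // still representable
        BigInteger tooBig = BigInteger.TEN.pow(400);  // beyond the double range
        System.out.println(logViaDouble(fits));
        System.out.println(logViaDouble(tooBig));
    }
}
```

So this only works as long as the summed value stays within the double range.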

Willis Blackburn
  • Interesting, I didn't know that doubles could hold such large numbers, though I have to ask: what's the point of holding 300+ digits when you can't really use anything other than the first 15? Well, my problem is, as I mentioned, more about the division, so I guess it shouldn't matter so much... I'll give it a try and see what happens :) – posdef Feb 20 '11 at 14:55
  • Just to be clear, what I mean is, add everything up using BigInteger, then convert to double for the purpose of generating the log. – Willis Blackburn Feb 20 '11 at 14:58
  • Regarding the question you asked, I guess the easy answer is, holding 300+ digits with 15 digits of precision is useful when you need to work with a number of that magnitude but don't care about having more than 15 digits of precision. :-) But to use a real example, the US national debt in cents is about 15 digits, and you can imagine that if your job involves tracking the debt, you probably don't care too much about tenths or hundredths of a cent. – Willis Blackburn Feb 20 '11 at 15:01
  • The thing is, precision for doubles isn't constant. The largest possible error for fp math following the IEEE-754 standard is 0.5 ULPs (i.e. the best possible result; this is computationally quite complex, so fast-math libs often allow a larger error), but the corresponding relative error lies between: 0.5 * b^-p <= 0.5 ULP <= b/2 * b^-p, with b (usually written β) being the base and p the precision. For details see "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg. Google will find more than enough. – Voo Feb 20 '11 at 15:53
  • Ah, and note that the standard Java math library only guarantees the result to be within 1-2 ulps; if accuracy is important, StrictMath will follow the IEEE standard. – Voo Feb 20 '11 at 16:03
2

If you are calculating binomial coefficients on numbers in the thousands, then doubles will not be good enough: the values can be far larger than the largest double.

Instead I would be inclined to call the toString method on the number, and compute the log as `log(10) * number.toString().length() + log(asFloat("0." + number.toString()))`, where asFloat takes a string representation of a number and converts it to a float.
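A minimal sketch of this digit-string approach might look like the following. The names are hypothetical, and it departs from the description in two small ways that are my own assumptions: it parses the mantissa as a double rather than a float, and it keeps only the first 18 digits, since further digits cannot affect a double noticeably.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of any library.
final class DigitLogSketch {
    // Natural log of a positive BigInteger, using N = 0.d1d2d3... * 10^n,
    // where n is the number of decimal digits of N.
    static double logBigInteger(BigInteger n) {
        if (n.signum() <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        String digits = n.toString();
        // Keep only the leading digits; the rest are below double precision.
        String leading = digits.substring(0, Math.min(digits.length(), 18));
        double mantissa = Double.parseDouble("0." + leading); // in [0.1, 1)
        return digits.length() * Math.log(10) + Math.log(mantissa);
    }

    public static void main(String[] args) {
        // 10^400 does not fit in a double, but its log is unproblematic.
        System.out.println(logBigInteger(BigInteger.TEN.pow(400)));
    }
}
```

Unlike a direct conversion to double, this keeps working no matter how many digits the number has.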

btilly
  • I am not sure I follow, could you elaborate, or alternatively supply a resource where I can read a bit on your suggestion? – posdef Feb 21 '11 at 08:42
  • Imagine that you had infinite precision math. Then an n-digit integer N can always be rewritten as (N/10^n)*10^n. The first factor is a floating point number of the form 0.(digits of N), which is something we can represent reasonably accurately inside a Float and then take a log of, and the log of the other factor is n*log(10). That is really my whole suggestion right there. – btilly Feb 21 '11 at 15:40
0

If you need maximum precision, how about converting the BigIntegers into BigDecimals and doing the algebra on them? If precision isn't paramount, then perhaps you can convert your BigIntegers into doubles and do simple algebra with them. Perhaps you can tell us more about your problem domain and why you feel logarithms are the best way to go.
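For the BigDecimal route, a sketch under a few assumptions could look like this: X and Y are both available as exact `BigInteger` values, Y is computed here with a simple multiplicative loop rather than via logGamma, and `MathContext.DECIMAL64` (16 significant digits) is an arbitrary choice of precision. All class and method names are invented for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Hypothetical helper, not part of any library.
final class BigDecimalRatioSketch {
    // Exact binomial coefficient C(n, k), assuming 0 <= k <= n.
    static BigInteger binomial(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    // p = 1 - X / Y, rounded to the precision given by mc.
    static BigDecimal significance(BigInteger x, BigInteger y, MathContext mc) {
        BigDecimal ratio = new BigDecimal(x).divide(new BigDecimal(y), mc);
        return BigDecimal.ONE.subtract(ratio, mc);
    }

    public static void main(String[] args) {
        BigInteger y = binomial(2000, 1000);          // far beyond the double range
        BigInteger x = y.divide(BigInteger.valueOf(4)); // made-up X for the demo
        System.out.println(significance(x, y, MathContext.DECIMAL64));
    }
}
```

Passing a `MathContext` to `divide` matters here, because without one `BigDecimal.divide` throws an `ArithmeticException` whenever the quotient has a non-terminating decimal expansion.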

Hovercraft Full Of Eels
  • You mean division on `BigDecimal`s? The potential problem with that is getting the binomial coefficient from its log value into a `BigDecimal`, or a `BigInteger` for that matter. It feels like it'd be "smarter" to work on the logarithmic scale. I am not sure what you mean by the problem domain, but as I mentioned in the question, I am working on a tool that looks for the possible outcomes of a process with many steps, and calculates the ratio. If you are after specific details, I'd try to clarify further... – posdef Feb 20 '11 at 15:06
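Staying on the logarithmic scale, as this comment prefers, could be sketched as follows. This is only an illustration with invented names. It assumes ln X has already been obtained somehow, and since the Java standard library has no logGamma, it swaps in a plain sum of logs for ln C(n, k), which is fine for n in the thousands.

```java
// Hypothetical helper, not part of any library.
final class LogScaleRatioSketch {
    // ln C(n, k) as a sum of logs, assuming 0 <= k <= n.
    // Stand-in for the logGamma-based formula mentioned in the question.
    static double logBinomial(int n, int k) {
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    // p = 1 - e^(lnX - lnY). Math.expm1(t) computes e^t - 1 directly,
    // which stays accurate when X / Y is close to 1.
    static double significance(double lnX, double lnY) {
        return -Math.expm1(lnX - lnY);
    }

    public static void main(String[] args) {
        double lnY = logBinomial(2000, 1000);
        double lnX = lnY - Math.log(4); // made-up X = Y / 4 for the demo
        System.out.println(significance(lnX, lnY));
    }
}
```

Neither X nor Y is ever materialized as a double here, so the overflow problem does not arise; only their logs are handled.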