1

We know that using float or double is not an option where exact precision is necessary, and we know BigDecimal serves that purpose, but we also know that it's about 100 times slower than regular primitive operations.

Now what do we do if speed is critical for us and we really do need precision?

I tried to store the value of the currency in its smallest unit and store its conversion, like 1 BTC = 100,000,000 satoshi, but after a few experiments it seemed that you simply can't store 100 BTC in a long because it exceeds the maximum possible value. Yes, there is the option to sacrifice precision, like storing microBTC and so on, but the problem is more general: how do we design such a thing with primitives?
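A minimal sketch of the smallest-unit idea I mean (class and method names here are illustrative, not an existing API):

```java
// Sketch: represent money as a count of the smallest unit (satoshi)
// held in a long. Names are illustrative only.
public final class SatoshiMath {
    public static final long SATOSHIS_PER_BTC = 100_000_000L; // 10^8

    // Math.addExact / multiplyExact throw ArithmeticException on
    // overflow instead of silently wrapping around.
    public static long add(long a, long b) {
        return Math.addExact(a, b);
    }

    public static long btcToSatoshi(long btc) {
        return Math.multiplyExact(btc, SATOSHIS_PER_BTC);
    }

    public static void main(String[] args) {
        long hundredBtc = btcToSatoshi(100);   // 10_000_000_000 -- fits in a long
        System.out.println(hundredBtc);
        // How many whole BTC a long can hold at satoshi precision:
        System.out.println(Long.MAX_VALUE / SATOSHIS_PER_BTC); // ~92 billion BTC
    }
}
```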

Peter O.
vach
  • 2
    I've not found `BigDecimal` vs. primitives to ever be a bottleneck. Are you sure it's too slow for your needs? – Kon Feb 03 '15 at 16:58
  • yes, a simple micro benchmark shows that; besides, it's obvious that you're creating an immutable object every single time, it just cannot be anything but a bottleneck when you need speed... – vach Feb 03 '15 at 16:59
  • i can find an article where an expert Java HFT developer states that BigDecimal is like 100 times slower in comparison to primitive types – vach Feb 03 '15 at 17:00
  • 1
    If you need arbitrary precision use a `BigDecimal` (or a `String`). No Java primitive type does what you want, and you can't create a new type without using an `Object`. – Elliott Frisch Feb 03 '15 at 17:01
  • i wonder how we keep and operate on monetary balances then? maybe store 2 values? like a long for the part before the decimal point and a long for the part after, and implement basic operations... – vach Feb 03 '15 at 17:01
  • 1
    A hundred times a very small value does not make a bottleneck. Primitive ops are on the scale of a single nanosecond. Microbenchmarks have nothing to do with identifying bottlenecks. – Marko Topolnik Feb 03 '15 at 17:01
  • @MarkoTopolnik you'd be right if math operations were just a small percent of the total flow's latency, but if it's like 50% then a 100x difference makes a 50x difference :( – vach Feb 03 '15 at 17:03
  • There are a number of respected SO members who work in that field, so you may get that wish fulfilled. – Marko Topolnik Feb 03 '15 at 17:03
  • @Vach You may want to put a bounty on this question to lure some experts. I'd be curious to hear what they say as well. – Kon Feb 03 '15 at 17:04
  • Well yes, that is the data point you need to provide in the question, we don't need to assume that. – Marko Topolnik Feb 03 '15 at 17:04
  • http://stackoverflow.com/help/bounty – Kon Feb 03 '15 at 17:06
  • Your question will be eligible for bounty in two days or so. See the hint somewhere below the question. – Marko Topolnik Feb 03 '15 at 17:06
  • i was afraid of that, i'll have to wait 2 days before putting a bounty :( thanks anyway – vach Feb 03 '15 at 17:06
  • 3
    Java's long type's maximum value is (2^63)-1. There will eventually be just 21m bitcoins in circulation. With 1 BTC = 10^8 satoshis, this means 2.1*10^15 satoshis in total. With 2.1*10^15 < 2^51 << 2^63 you should be fine with long ... – Sirko Feb 03 '15 at 17:07
  • @Sirko maybe i've calculated something wrong but... oh i think i've made a mistake, my IDEA shows "integer value is too big" and i thought this was a compilation error... – vach Feb 03 '15 at 17:10
  • I don't understand what's wrong here. thanks, very stupid of me not noticing this... – vach Feb 03 '15 at 17:13
  • 3
    Study D.E. Knuth's vol. 2. You can implement fixed-point arithmetic quite efficiently using the "seminumerical algorithms". I've been through that, and I have made a workalike of BigInteger without recreating an object for each new op result. Yes, it means work, but if you really think you need it... Not sure what algos you plan, but BigDecimal isn't sooo slow. – laune Feb 03 '15 at 17:17
  • @laune thanks i'll look into that, D.Knuth is a megamind :) – vach Feb 03 '15 at 17:18
  • @laune you can put your comment as an answer, as it really does answer my question. – vach Feb 03 '15 at 17:27
  • @Vach Done++. Glad you found this helpful. – laune Feb 03 '15 at 17:36
  • 1
    @Sirko i think you'll have an answer too, long definitely will work; my confusion was caused by this :) http://stackoverflow.com/questions/28305327/intellij-long-integer-value-is-too-big-but-in-range-of-long-maxvalue – vach Feb 03 '15 at 18:12
  • Use floating-point math if you want floating-point math. – tmyklebu Feb 03 '15 at 19:24
  • 1
    @tmyklebu Don't use floating-point math if you don't want floating-point math. – Daniel Dinnyes Feb 04 '15 at 10:46
  • @DanielDinnyes: Shrug. OP lays out the case for using floating-point math in his application. No reason not to do it. – tmyklebu Feb 04 '15 at 14:39
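Sirko's back-of-the-envelope bound can be checked directly (a quick sketch; the constants come from his comment):

```java
// Verifying that the total possible bitcoin supply, counted in
// satoshis, fits comfortably in a long.
public class BtcRangeCheck {
    public static void main(String[] args) {
        long satoshisPerBtc = 100_000_000L;  // 10^8
        long maxBtcSupply   = 21_000_000L;   // ~21 million BTC will ever exist
        long totalSatoshis  = maxBtcSupply * satoshisPerBtc; // 2.1 * 10^15
        System.out.println(totalSatoshis);               // 2100000000000000
        System.out.println(totalSatoshis < (1L << 51));  // true: far below 2^63 - 1
    }
}
```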

2 Answers

5

As D.E. Knuth has amply documented in vol. 2 of "The Art of Computer Programming", implementing arithmetic with arbitrary precision isn't "black art" - see the chapter "Seminumerical Algorithms". I've been through that, driven by the necessity of implementing COBOL computationals (so: not only integers).

Also, making a workalike of BigInteger or BigDecimal without recreating an object for each new op result is an option. Yes, it means work, but if you really think you need it...

I have found that BigInteger isn't so slow as far as the arithmetic is concerned. What really killed it for me was the need to convert between binary and decimal (an array of decimal digits) frequently.
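One way to picture a "workalike without recreating an object for each result" is an accumulator that mutates its own state in place (a hypothetical sketch, not the actual implementation referred to above):

```java
// Hypothetical sketch: a mutable accumulator over fixed-point values
// stored as scaled longs. Operations overwrite the existing state
// instead of allocating a new object per result, unlike
// BigDecimal/BigInteger, whose every operation returns a fresh object.
public final class MutableMoney {
    private long units; // value in the smallest currency unit, e.g. satoshis

    public MutableMoney(long units) {
        this.units = units;
    }

    public MutableMoney addInPlace(long other) {
        units = Math.addExact(units, other); // overflow-checked, no allocation
        return this;                          // returning this allows chaining
    }

    public long value() {
        return units;
    }
}
```

Usage would look like `balance.addInPlace(deposit).addInPlace(-fee)`, producing zero garbage per operation at the cost of the usual aliasing hazards of mutable state.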

laune
  • *Seminumerical Algorithms* is the title of the whole volume, not one chapter. The chapter you refer to is entitled *Arithmetic.* – user207421 Jan 29 '16 at 03:13
1

Your options are:

BigDecimal - Accurate and effective - slower than primitives but probably not measurably so - mostly immutable (i.e. can change accuracy but not value).

long - Accurate but has value limits - primitive so cannot be bested for speed - immutable - maths is easier and clearer to write.

BigInteger - Probably your best halfway house between the above - immutable so you have to make new ones whenever you change the value - you are unlikely to hit its limits.
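For a concrete feel of the first two options, a small sketch (illustrative only; the cents scale stands in for whatever smallest unit applies):

```java
import java.math.BigDecimal;

// The same sum done both ways: BigDecimal with an exact decimal
// representation, and a long holding the amount in cents.
public class MoneyOptions {
    public static void main(String[] args) {
        // BigDecimal: construct from String so the value is exactly
        // 19.99, with no binary floating-point artifacts.
        BigDecimal a = new BigDecimal("19.99");
        BigDecimal b = new BigDecimal("0.01");
        System.out.println(a.add(b)); // prints 20.00, exact

        // long in cents: fast primitive math, but fixed precision and
        // manual formatting back to a decimal string.
        long cents = 1999 + 1;
        System.out.println(String.format("%d.%02d", cents / 100, cents % 100)); // prints 20.00
    }
}
```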

OldCurmudgeon
  • 2
    `long` is just as mutable as `BigInteger`, really. – Marko Topolnik Feb 03 '15 at 17:07
  • i don't have to worry about mutability as those values will not be operated on in a multithreaded fashion... – vach Feb 03 '15 at 17:09
  • @MarkoTopolnik - Yes - in a purist sense - but there are many functional differences between `long c = b + 1` and `BigInteger c = b.add(BigInteger.ONE)`. Perhaps mutable was the wrong word - maybe I meant more `concise maths`. – OldCurmudgeon Feb 03 '15 at 17:10
  • @Vach - for scientific calculations using mutables can make a significant benefit as you are spending most of your time in the maths. When dealing with monetary values most of your time is spent working out what to do next. The actual maths is rarely a bottleneck so mutability has hardly any benefit. – OldCurmudgeon Feb 03 '15 at 17:14
  • @OldCurmudgeon in my case one cycle of calculations will create a hell of a lot of BigDecimal objects, and even if i don't consider the speed, they will fill my heap, and at some point i'll just have a GC operation, which is not acceptable, as i need it to run like once a day, or very rarely at least... – vach Feb 03 '15 at 17:16
  • 1
    @Vach - You'd be surprised how efficient the JVM is in re-using the objects you let go. Stop worrying about it. If there is a **measurable** speed issue, deal with it later. Take a look [here](http://programmers.stackexchange.com/questions/149563/should-we-avoid-object-creation-in-java). – OldCurmudgeon Feb 03 '15 at 17:19
  • Thanks, i guess my benchmarks are not that correct, will have to use Oracle's JMH or whatever it's called... – vach Feb 03 '15 at 17:24
  • @OldCurmudgeon yeah i know how Java rocks when it comes to object creation, that's the main reason it sometimes beats even C code, because C uses malloc every time, which is expensive... – vach Feb 03 '15 at 17:26
  • 1
    @vach There's a lot more you need to know about HotSpot's memory management---such as *Escape Analysis* and the prospect of not allocating on the heap at all, and the huge difference between regular Minor GCs and the Major GC. – Marko Topolnik Feb 03 '15 at 18:37
  • @MarkoTopolnik thanks, i was just reading about that, perhaps i'm over-optimizing my task, but regarding your comment: I cannot set HotSpot not to GC at all, because obviously a lot of garbage will collect over time; the trick is not to accumulate too much garbage before your GC happens (which might be once a day...), thus i don't want to create a lot of objects (because i know GC will run very rarely) – vach Feb 03 '15 at 18:40
  • i found this http://vanillajava.blogspot.com/ it is very good, the guy who runs the blog is an HFT expert :) – vach Feb 03 '15 at 18:41
  • 1
    That's our Peter Lawrey, one of the top guys around here. About GC, you are wrong that it is imperative it not run---it is imperative that a *Major GC* not run; nobody ever worries about minor ones. The frequency of Major GCs has very little to do with the *amount* of garbage you create---it has to do with the *lifespan* of objects. The only thing to avoid is retaining a huge amount of objects at once and your use case does not indicate that risk at all. – Marko Topolnik Feb 03 '15 at 19:15