I have several number-crunching operations that account for a good portion of the CPU time. One example of such operations is this function:
import Data.Number.Erf
import Math.Gamma
import Math.GaussianQuadratureIntegration as GQI
-- Kummer's "1F1", a.k.a. M(a,b,z), the confluent hypergeometric function,
-- approximated by Gaussian quadrature with 128 up to 1024 points of resolution
kummer :: Double -> Double -> Double -> Double -> Double
kummer a b z err = gammaFactor * integralPart
  where
    gammaFactor  = gamma b / (gamma a * gamma (b-a))
    integralPart = integrator err fun 0 1
    fun t        = exp (z * t) * (1-t) ** (b-a-1) * t ** (a-1)
    integrator e
      | e > 0.1   = GQI.nIntegrate128
      | e > 0.01  = GQI.nIntegrate256
      | e > 0.001 = GQI.nIntegrate512
      | otherwise = GQI.nIntegrate1024
So, I was wondering whether there are rules of thumb for deciding when a function should be marked INLINE to improve performance. The REPA authors suggest:
Add INLINE pragmas to all leaf-functions in your code, especially ones that compute numeric results. Non-inlined lazy function calls can cost upwards of 50 cycles each, while each numeric operator only costs one (or less). Inlining leaf functions also ensures they are specialized at the appropriate numeric types.
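To make the advice concrete, here is a minimal sketch (the function names are hypothetical, chosen just for illustration) of what "adding an INLINE pragma to a leaf function" looks like. The pragma asks GHC to substitute the function's body at every call site, which lets it be specialized at the concrete numeric type and fused with surrounding code:

```haskell
module Main where

-- A small "leaf" function that computes a numeric result.
-- The INLINE pragma asks GHC to inline it at every call site,
-- so the multiplication is specialized at Double and no
-- out-of-line call is made.
square :: Num a => a -> a
square x = x * x
{-# INLINE square #-}

-- A caller that benefits: with square inlined, GHC can fuse
-- the map and sum into a single tight loop over unboxed Doubles.
sumSquares :: [Double] -> Double
sumSquares = sum . map square

main :: IO ()
main = print (sumSquares [1, 2, 3])  -- prints 14.0
```

Compiling with `-O2` and inspecting the Core (`-ddump-simpl`) is the reliable way to check whether the inlining and specialization actually happened.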
Does this advice also apply to numerical calculations in general, or only to array computations? Or is there a more general guideline for deciding when a function should be inlined?
Note that the post "Is there any reason not to use the INLINABLE pragma for a function?" does not directly address whether the hints provided by the programmer actually help the compiler optimize the code.
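For context on the alternative that post discusses, here is a minimal sketch (the function name is hypothetical) of INLINABLE, which is the gentler hint: it keeps the full definition in the module's interface file so callers in other modules can specialize it at their concrete types, but unlike INLINE it leaves the decision to inline up to GHC's heuristics:

```haskell
module Main where

-- INLINABLE exposes the unfolding across module boundaries so the
-- polymorphic code can be specialized (e.g. at Double) at call sites,
-- without forcing unconditional inlining the way INLINE does.
norm2 :: Floating a => [a] -> a
norm2 xs = sqrt (sum (map (\x -> x * x) xs))
{-# INLINABLE norm2 #-}

main :: IO ()
main = print (norm2 [3, 4 :: Double])  -- prints 5.0
```

One can also pair INLINABLE with an explicit `{-# SPECIALIZE norm2 :: [Double] -> Double #-}` pragma to guarantee a monomorphic copy is generated.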