9

I was surprised by a timing difference that Daniel Lichtblau pointed out in two ways to get at the differences between the number of prime factors (PrimeOmega) and the number of distinct prime factors (PrimeNu) of an integer, n. So I decided to look into it a bit further.

The functions g1 and g2 below are slight variations of the ones Daniel used as well as three others. They all return the number of square-free integers from 1 through n. But the differences are fairly dramatic. Can anyone explain the reasons behind the differences. In particular, why does the Sum in g5 provide such a speed advantage?

g1[n_] := Count[PrimeOmega[Range[n]] - PrimeNu[Range[n]], 0]

g2[n_] := Count[With[{fax = FactorInteger[#]}, 
 Total[fax[[All, 2]]] - Length[fax]] & /@ Range[n], 0]

g3[n_] := Count[SquareFreeQ/@ Range[n], True]

(* g3[n_] := Count[SquareFreeQ[#] & /@ Range[n], True] Mr.Wizard's suggestion
   incorporated above. Better written but no significant increase in speed. *) 

g4[n_] := n - Count[MoebiusMu[Range[n]], 0]

g5[n_] := Sum[MoebiusMu[d]*Floor[(n - 1)/d^2], {d, 1, Sqrt[n - 1]}]  

The comparison:

n = 2^20;
Timing[g1[n]]
Timing[g2[n]]
Timing[g3[n]]
Timing[g4[n]]
Timing[g5[n]]

The results:

{44.5867, 637461}
{11.4228, 637461}
{4.43416, 637461}
{1.00392, 637461}
{0.004478, 637461}

Edit:

Mysticial raised the possibility that the latter routines were reaping the benefits of their order--that some caching may have been going on.

So let's run the comparison in the reverse order:

n = 2^20;
Timing[g5[n]]
Timing[g4[n]]
Timing[g3[n]]
Timing[g2[n]]
Timing[g1[n]]

Results of reversed comparisons:

{0.003755, 637461}
{0.978053, 637461}
{4.59551, 637461}
{11.2047, 637461}
{44.5979, 637461}

Verdict: A reasonable guess, but not supported by the data.

Community
  • 1
  • 1
DavidC
  • 3,056
  • 1
  • 20
  • 30
  • btw, `SquareFreeQ[#] & /@ Range[n]` is verbose; `SquareFreeQ /@ Range[n]` should suffice. – Mr.Wizard Oct 14 '11 at 19:20
  • Elaboration on the comment above: you typically should only see a pure function of the form `function[#] &` if you need to extract the first element of an expression. For example, instead of `Sqrt[First[#]]& /@ {{2, 4, 6}, {8, 10, 12}}` you might write `Sqrt[#]& @@@ {{2, 4, 6}, {8, 10, 12}}` – Mr.Wizard Oct 15 '11 at 08:29
  • @Mr.Wizard Nice tip. I'm enjoying pure functions (so much so, that I occasionally overuse them!) – DavidC Oct 15 '11 at 08:43

1 Answers1

8

ModebiusMu for smallish numbers has some dedicated fast sieving code that in effect shortcuts actual factorization, just counting as it finds factors. It will not add exponents like PrimeOmega so it really has less of a task. Also it can short-circuit whenever it detects a prime factor with multiplicity.

I believe the cutoff is, coincidently, somewhere around 2^20. Not had time to test but it may indeed get slower a bit higher.

That answers for g4. Obviously g5 is using cleverness on the part of the programmer (nothing wrong with that of course), hence the speed gain.

As for g3, it is in essence a call to g4 in terms of implementation code. The bottleneck, in this case, is preprocessing overhead because it is implemented in mathematica rather than C code. I suspect this would be less visible if larger inputs were used, since in such cases more time would be spent doing the real work rather than in Mathematica interpreter overhead. Again, I have not tested this.

Daniel Lichtblau
  • 6,854
  • 1
  • 23
  • 30
  • Thanks for the inside info about MoebiusMu. The "cleverness" of the programmer seems to have consisted in the insight that no square of a prime greater than the square root of n could be a divisor of n. – DavidC Oct 14 '11 at 19:49
  • @Mr.Wizard Not much I can say. Specifically, I've no idea whether that one indicates a bug or a feature or intentional behavior. – Daniel Lichtblau Oct 21 '11 at 20:28
  • @Daniel, thank you. Is the "phantom value" of `5` in the example given, anywhere accessible to the user? – Mr.Wizard Oct 21 '11 at 20:35
  • @Mr.Wizard I do not know but suspect it is not. Possibly it is held internally in a hash table associated with whatever DownValues et al were defined after it was introduced as a Default and before any other value succeeded it in that capacity. This is really not the best way to get viable discussion of that post, by the way. – Daniel Lichtblau Oct 22 '11 at 23:22
  • @Daniel, I am not sure how to read "This is really not the best way to get viable discussion of that post, by the way." but if I have offended you, I apologize. – Mr.Wizard Oct 22 '11 at 23:27
  • @Mr.Wizard Not offended. What I meant is you will have a better chance of getting viable answers, and in a way that can be found by searching, if you post the question to SO or MathGroup. Obviously if SO it would need to be slightly different from the original question (and this need indicates that SO is an imperfect format for long-term follow-up). Asking a handful of individuals (e.g. myself or Brett Champion) to comment on a far back post has both low probability of getting a knowledgeable answer, and small likelihood that it will show up in searches (being buried in a comment). – Daniel Lichtblau Oct 23 '11 at 15:10