I have tried a few versions of a simple predicate which pulls random values out of the extra-logical universe and puts them into a list. I assumed that the version with the accumulator would be be tail-call optimized, as nothing happens after recursive call so a path to optimization exists, but it is not (it uses the "global stack"). On the other hand, the "naive version" has apparently been optimized into a loop. This is SWI Prolog.
Why would the accumulator version be impervious to tail-call optimization?
Here are the predicate versions.
Slowest, runs out of local stack space (expectedly)
Here, we just allow a head with function symbols to make things explicit.
% Slowest, and uses 4 inferences per call (+ 1 at the end of recursion).
% Uses "local stack" indicated in the "Stack limit (1.0Gb) exceeded"
% error at "Stack depth: 10,321,204":
% "Stack sizes: local: 1.0Gb, global: 7Kb, trail: 1Kb"
oracle_rands_explicit(Out,Size) :-
Size>0, !,
NewSize is Size-1,
oracle_rands_explicit(R,NewSize),
X is random_float,
Out = [X-Size|R].
oracle_rands_explicit([],0).
?- oracle_rands_explicit(X,4).
X = [0.7717053554954681-4, 0.9110187097066331-3, 0.9500246711335888-2, 0.25987829195170065-1].
?- X = 1000000, time(oracle_rands_explicit(_,X)).
% 4,000,001 inferences, 1.430 CPU in 1.459 seconds (98% CPU, 2797573 Lips)
?- X = 50000000, time(oracle_rands_explicit(_,X)).
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1.0Gb, global: 7Kb, trail: 1Kb
ERROR: Stack depth: 10,321,204, last-call: 0%, Choice points: 6
ERROR: Possible non-terminating recursion: ...
Faster, and does not run out of stack space
Again, we just allow a head with no function symbols to make things explicit, but we move the recursive call to the end of the body, which apparently makes a difference!
% Same number of inferences as Slowest, i.e. 4 inferences per call
% (+ 1 at the end of recursion), but at HALF the time.
% Does not run out of stack space! Conclusion: this is tail-call-optimized.
oracle_rands_explicit_last_call(Out,Size) :-
Size>0, !,
NewSize is Size-1,
X is random_float,
Out = [X-Size|R],
oracle_rands_explicit_last_call(R,NewSize).
oracle_rands_explicit_last_call([],0).
?- oracle_rands_explicit_last_call(X,4).
X = [0.6450176209046125-4, 0.5605468429780708-3, 0.597052872950385-2, 0.14440970112076815-1].
?- X = 1000000, time(oracle_rands_explicit_last_call(_,X)).
% 4,000,001 inferences, 0.697 CPU in 0.702 seconds (99% CPU, 5739758 Lips)
?- X = 50000000, time(oracle_rands_explicit_last_call(_,X)).
% 200,000,001 inferences, 32.259 CPU in 32.464 seconds (99% CPU, 6199905 Lips)
Compact, less inferences, and does not run out of stack space
Here we allow function symbols in the head for more compact notation. Still naive recursion.
% Only 3 inferences per call (+ 1 at the end of recursion), but approx % same time as "Faster". % Does not run out of stack space! Conclusion: this is tail-call-optimized.
oracle_rands_compact([X-Size|R],Size) :-
Size>0, !,
NewSize is Size-1,
X is random_float,
oracle_rands_compact(R,NewSize).
oracle_rands_compact([],0).
?- oracle_rands_compact(X,4).
X = [0.815764980826608-4, 0.6516093608470418-3, 0.03206964297092248-2, 0.376168614426895-1].
?- X = 1000000, time(oracle_rands_compact(_,X)).
% 3,000,001 inferences, 0.641 CPU in 0.650 seconds (99% CPU, 4678064 Lips)
?- X = 50000000, time(oracle_rands_compact(_,X)).
% 150,000,001 inferences, 29.526 CPU in 29.709 seconds (99% CPU, 5080312 Lips)
Accumulator-based and unexpectedly runs out of (global) stack space
% Accumulator-based, 3 inferences per call (+ 1 at the end of recursion + 1 at ignition),
% but it is often faster than the compact version.
% Uses "global stack" as indicated in the "Stack limit (1.0Gb) exceeded"
% error at "Stack depth: 12,779,585":
% "Stack sizes: local: 1Kb, global: 0.9Gb, trail: 40.6Mb"
oracle_rands_acc(Out,Size) :- oracle_rands_acc(Size,[],Out).
oracle_rands_acc(Size,ThreadIn,ThreadOut) :-
Size>0, !,
NewSize is Size-1,
X is random_float,
oracle_rands_acc(NewSize,[X-Size|ThreadIn],ThreadOut).
oracle_rands_acc(0,ThreadIn,ThreadOut) :-
reverse(ThreadIn,ThreadOut).
?- oracle_rands_acc(X,4).
X = [0.7768407880604368-4, 0.03425412654687081-3, 0.6392634169514991-2, 0.8340458397587001-1].
?- X = 1000000, time(oracle_rands_acc(_,X)).
% 4,000,004 inferences, 0.798 CPU in 0.810 seconds (99% CPU, 5009599 Lips)
?- X = 50000000, time(oracle_rands_acc(_,X)).
ERROR: Stack limit (1.0Gb) exceeded
ERROR: Stack sizes: local: 1Kb, global: 0.9Gb, trail: 40.6Mb
ERROR: Stack depth: 12,779,585, last-call: 100%, Choice points: 6
ERROR: In:
ERROR: [12,779,585] user:oracle_rands_acc(37220431, [length:12,779,569], _876)
Addendum: Another version of the "compact" version.
Here we move the Size
parameter to the first position, and do not use !
. But indexing is a complex matter. Differences will probably be of note with many more clauses only.
oracle_rands_compact2(Size,[X-Size|R]) :-
Size>0,
NewSize is Size-1,
X is random_float,
oracle_rands_compact2(NewSize,R).
oracle_rands_compact2(0,[]).
Trying, with L
instead of an anonymous variable, and L
used after call.
X = 10000000, time(oracle_rands_compact2(X,L)),L=[].
% 30,000,002 inferences, 6.129 CPU in 6.159 seconds (100% CPU, 4894674 Lips)
X = 10000000, time(oracle_rands_compact(L,X)),L=[].
% 30,000,001 inferences, 5.865 CPU in 5.892 seconds (100% CPU, 5115153 Lips)
Maybe marginally faster. The above numbers vary a bit, one would really have to generate a full statistics over a hundred runs or so.
Does re-introducing the cut make it faster (it doesn't seem to make it slower)?
oracle_rands_compact3(Size,[X-Size|R]) :-
Size>0, !,
NewSize is Size-1,
X is random_float,
oracle_rands_compact3(NewSize,R).
oracle_rands_compact3(0,[]).
?- X = 10000000, time(oracle_rands_compact3(X,L)),L=[].
% 30,000,001 inferences, 5.026 CPU in 5.061 seconds (99% CPU, 5969441 Lips)
Can't say, really.