I have a function that returns integer values to integer input. The output values are relatively sparse; the function only returns around 2^14 unique outputs for input values 1....2^16. I want to create a dataset that lets me quickly find the inputs that produce any given output.
At present, I'm storing my dataset in a Map of Lists, with each output value serving as the key for a List of input values. This seems slow and appears to use a whole of stack space. Is there a more efficient way to create/store/access my dataset?
Added: It turns out the time taken by my sparesearray() function varies hugely on the ratio of output values (i.e., keys) to input values (values stored in the lists). Here's the time taken for a function that requires many lists, each with only a few values:
? sparsearray(2^16,x->x\7);
time = 126 ms.
Here's the time taken for a function that requires only a few lists, each with many values:
? sparsearray(2^12,x->x%7);
time = 218 ms.
? sparsearray(2^13,x->x%7);
time = 892 ms.
? sparsearray(2^14,x->x%7);
time = 3,609 ms.
As you can see, the time increases exponentially!
Here's my code:
\\ sparsearray takes two arguments, an integer "n" and a closure "myfun",
\\ and returns a Map() in which each key a number, and each key is associated
\\ with a List() of the input numbers for which the closure produces that output.
\\ E.g.:
\\ ? sparsearray(10,x->x%3)
\\ %1 = Map([0, List([3, 6, 9]); 1, List([1, 4, 7, 10]); 2, List([2, 5, 8])])
sparsearray(n,myfun=(x)->x)=
{
my(m=Map(),output,oldvalue=List());
for(loop=1,n,
output=myfun(loop);
if(!mapisdefined(m,output),
/* then */
oldvalue=List(),
/* else */
oldvalue=mapget(m,output));
listput(oldvalue,loop);
mapput(m,output,oldvalue));
m
}