
I'm performing a statistical analysis to determine whether a larger transaction may have been hidden by breaking it into smaller transactions within a certain time frame. I break the larger data set into smaller subsets (arrays of 12, at the moment) and then run a series of loops over each subset to determine whether any combination of the elements adds up to a value within a target range.

Here's my current code:

amounts_matrix = [1380.54,9583.33,37993.04,3240.96...]


matrix_amounts = amounts_matrix.length
total_permutations = 0;
total_hits = 0;
target_range = 1
target = 130000
low_threshold = target - target_range
high_threshold = target + target_range
entries = []
range = 12


for(x = 0; x< matrix_amounts-(range-1); x++){
    amounts = amounts_matrix.slice(x, x+range)
    total_amounts = range


    for(i = 0; i< total_amounts; i++){
        entries.push(amounts[i])
        totalcheck(entries)
        entries = []
    }

    for(i = 0; i< total_amounts; i++){
        for(j = i+1; j< total_amounts; j++){
            entries.push(amounts[i])
            entries.push(amounts[j])
            totalcheck(entries)
            entries = []
        }
    }

    ...

    for(i = 0; i< total_amounts; i++){
        for(j = i+1; j< total_amounts; j++){
            for(k = j+1; k< total_amounts; k++){
                for(l = k+1; l< total_amounts; l++){
                    for(m = l+1; m< total_amounts; m++){
                        for(n = m+1; n< total_amounts; n++){
                            for(o = n+1; o< total_amounts; o++){
                                for(p = o+1; p< total_amounts;p++){
                                    for(q = p+1; q< total_amounts;q++){
                                        for(r = q+1; r< total_amounts;r++){
                                            for(s = r+1; s< total_amounts;s++){
                                                for(t = s+1; t< total_amounts;t++){
                                                    entries.push(amounts[i])
                                                    entries.push(amounts[j])
                                                    entries.push(amounts[k])
                                                    entries.push(amounts[l])
                                                    entries.push(amounts[m])
                                                    entries.push(amounts[n])
                                                    entries.push(amounts[o])
                                                    entries.push(amounts[p])
                                                    entries.push(amounts[q])
                                                    entries.push(amounts[r])
                                                    entries.push(amounts[s])
                                                    entries.push(amounts[t])
                                                    totalcheck(entries)
                                                    entries = []
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }


}


function totalcheck(array){ 

    total_permutations += 1;

    sum_amount = 0
    for(z = 0; z < array.length; z++){
        sum_amount += array[z]
    }

    if(sum_amount > low_threshold && sum_amount < high_threshold){
        total_hits += 1
        console.log(array)
        console.log(sum_amount.toFixed(2))
        console.log("---------------")
    }

}

console.log("overall total hits = " + total_hits)
console.log("overall total permutations = " + total_permutations)

I'm pretty embarrassed by how extensive those for loops get, and I'd like to generalize it with a function where I can just tell it to run X loops rather than having to build them out like this. The permutation functions I've found aren't really viable for me because they all build arrays full of the total possibilities; in mine I want to check against the target as I go to avoid having gigantic arrays and running into memory issues. How do I build a recursive loop that will do this?

asetniop
  • What's wrong with using memory? I read your problem and the first thing I thought was "build an array of total possibilities". Is that a problem? Why? – Ben Mar 10 '18 at 21:21
  • You cannot reasonably create a function to create loops; the eval() function may help you there. However, a function that controls the number of loops it runs can be written quite easily, and a callback can even be supplied to give unique functionality to each loop – Ben Mar 10 '18 at 21:28
  • Because I want to be able to do it with larger windows than just 12 - which would get me into the realm of arrays that are 50! long or even bigger. – asetniop Mar 10 '18 at 21:40
  • Could you explain what this should do? I guess there is a much shorter and better solution to it – Jonas Wilms Mar 10 '18 at 21:52
  • Sure - given a base set like [1,2,3,4] and a target value of 7, it should spit out [3,4] and [1,2,4]. – asetniop Mar 10 '18 at 22:04
  • The number of combinations you are considering is the sum of pascal triangle rows, which is \(2^n\), so this is an exponential time algorithm. To make it more tractable I would suggest some coding logic. First order the array largest first so that you have a regular structure to work with. It would probably help with some mathematical analysis. For example, if the mean of your array is less than target / length of array then you will have zero answers, etc. – Attack68 Mar 10 '18 at 22:43
  • Sorting isn't an option, the financial aspect means they have to stay in the original sequence so you can check for combinations within a certain time frame. – asetniop Mar 10 '18 at 22:46
  • I understand that 'amounts_matrix' cannot be reorganised, but 'amounts' is your sub-sampled array from that, upon which you perform the loops, unless I'm mistaken. That is what I suggest you order; it would considerably increase the efficiency, with included logic. – Attack68 Mar 10 '18 at 23:02
  • @asetniop if you're within a certain time-frame, you could possibly copy those transactions and sort them. – גלעד ברקן Mar 10 '18 at 23:20
  • It is also more efficient to evaluate all combinations of the largest number of items first. For example, if you determine that no combination of 7 items produces a sum higher than the lower threshold, then you can be certain that no combination of 6 or fewer items will produce the right sum either, and hence you escape about 1/3 of the total work. – Attack68 Mar 10 '18 at 23:25
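
The count mentioned in the comments can be written out explicitly. This is only a worked illustration; the window sizes 12 and 50 are taken from the question and its comments, and the empty combination is subtracted because it is never a hit.

```latex
\[
  \sum_{k=0}^{n} \binom{n}{k} = 2^{n}
  \qquad\Longrightarrow\qquad
  \begin{aligned}
    n = 12 &: \; 2^{12} - 1 = 4095 \text{ non-empty combinations per window} \\
    n = 50 &: \; 2^{50} - 1 \approx 1.1 \times 10^{15} \text{ non-empty combinations per window}
  \end{aligned}
\]
```

So a streaming check keeps memory flat, but the running time still doubles with every element added to the window unless some combinations can be pruned.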

4 Answers


You could build a list of indices that you are going to check:

 const positions = Array.from({length: 12}, (_, i) => i);

Now we advance the highest index; when it reaches the upper array boundary, we advance the second-highest index and reset the ones after it, and so on, so we slowly step through all combinations:

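 function next(){
   const k = positions.length, n = amounts.length;
   // Find the rightmost index that has not reached its maximum (n - k + i)
   let i = k - 1;
   while(i >= 0 && positions[i] === n - k + i) i--;
   if(i < 0) return false; // every combination has been visited
   positions[i]++;
   // Reset all later indices to the tightest ascending run
   for(let j = i + 1; j < k; j++) positions[j] = positions[j - 1] + 1;
   return true;
 }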


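Now that we've got the indices, we just need to sum up the array values they refer to until we find a sum inside the target range:

  do {
    const sum = positions.reduce((sum, pos) => sum + amounts[pos], 0);
    // low_threshold and high_threshold are the bounds defined in the question
    if(sum > low_threshold && sum < high_threshold) break;
  } while(next())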

To get the sums for combinations of different lengths, just run the whole thing once per length.
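
The index-stepping idea can also be sketched as a generator, so each combination is checked as it is produced and nothing large is ever stored. This is a minimal sketch, not the answer's own code: `indexCombinations` and `findHits` are hypothetical names, and the range check uses exclusive bounds like the question's `totalcheck`.

```javascript
// Hypothetical helper: yields every ascending k-index combination of 0..n-1, one at a time.
function* indexCombinations(n, k) {
  if (k > n) return;
  const positions = Array.from({ length: k }, (_, i) => i);
  while (true) {
    yield positions.slice(); // copy, so callers may keep the array
    // Rightmost index that has not reached its maximum (n - k + i)
    let i = k - 1;
    while (i >= 0 && positions[i] === n - k + i) i--;
    if (i < 0) return; // every combination visited
    positions[i]++;
    // Reset the later indices to the tightest ascending run
    for (let j = i + 1; j < k; j++) positions[j] = positions[j - 1] + 1;
  }
}

// Hypothetical helper: collects index sets whose sum lies strictly between low and high.
function findHits(amounts, low, high) {
  const hits = [];
  for (let k = 1; k <= amounts.length; k++) {
    for (const positions of indexCombinations(amounts.length, k)) {
      const sum = positions.reduce((acc, pos) => acc + amounts[pos], 0);
      if (sum > low && sum < high) hits.push(positions);
    }
  }
  return hits;
}

console.log(JSON.stringify(findHits([1, 2, 3, 4], 6, 8))); // [[2,3],[0,1,3]]
```

With the base set [1,2,3,4] and a range around 7, the hits are the index sets for [3,4] and [1,2,4], which matches the example given in the question's comments.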

Jonas Wilms
  • This looks very promising. Let me get my head around it, but I think it's going to do the trick. – asetniop Mar 10 '18 at 22:49
  • It gets wonky and doesn't seem to stay within the parameters (eventually the count in the last column gets too high; should max out at 10 but goes up to 11 and 12). I used this method as the foundation for my answer though, so thank you! – asetniop Mar 11 '18 at 03:28

Since you've tagged and titled the question "recursion," let's build a recursion.

Let's also assume we'll provide the function with sorted input, so we can avoid enumerating all n-choose-k subsets in favour of an early exit when the next amount is too large. (If the input is not sorted, we can simply remove the check for "too large" in the functions below.)

(Note that JavaScript, at least in the browser, offers limited recursion depth so you might consider converting the process to an explicit stack iteration.)

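// Returns indexes of elements that compose sums within the specified range.
// Assumes amounts is sorted ascending and contains no negative values.
function f(amounts, i, low, high, k, sum, indexes){
  // k elements chosen: keep the subset only if its sum is inside (low, high)
  if (!k)
    return low < sum && sum < high ? [indexes] : [];

  // Not enough elements left to choose k more
  if (i == amounts.length || amounts.length - i < k)
    return [];

  // amounts[i] and every later amount would overshoot, so exactly k can't be reached
  if (sum + amounts[i] > high)
    return [];

  let _indexes = indexes.slice();
  _indexes.push(i);

  // Either take amounts[i] or skip it
  return f(amounts, i + 1, low, high, k - 1, sum + amounts[i], _indexes)
           .concat(f(amounts, i + 1, low, high, k, sum, indexes));
}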

console.log(JSON.stringify(f([1,2,3,4], 0, 6, 8, 3, 0, [])));
console.log(JSON.stringify(f([1,2,3,4], 0, 4, 7, 2, 0, [])));
console.log(JSON.stringify(f([1,2,3,4], 0, 4, 7, 3, 0, [])));

The above version limits the search to a specific number of transactions, k. The version I first posted was for general k, meaning subsets of any cardinality:

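// Assumes amounts is sorted ascending and contains no negative values.
function f(amounts, i, low, high, sum, indexes){
  if (i == amounts.length)
    return low < sum && sum < high ? [indexes] : [];

  // amounts[i] and every later amount would overshoot, so nothing more can be added
  if (sum + amounts[i] > high)
    return low < sum && sum < high ? [indexes] : [];

  let _indexes = indexes.slice();
  _indexes.push(i);

  // Either take amounts[i] or skip it
  return f(amounts, i + 1, low, high, sum + amounts[i], _indexes)
           .concat(f(amounts, i + 1, low, high, sum, indexes));
}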

console.log(JSON.stringify(f([1,2,3,4], 0, 6, 8, 0, [])));
console.log(JSON.stringify(f([1,2,3,4], 0, 4, 7, 0, [])));
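
The note about limited recursion depth suggests an explicit stack. Here is a minimal sketch of what that could look like for the any-size search, assuming sorted, non-negative input and pruning on `amounts[i]`; `subsetsInRange` is a hypothetical name, not part of the code above.

```javascript
// Hypothetical iterative variant: explicit stack instead of recursion.
// Assumes amounts is sorted ascending and contains no negative values.
function subsetsInRange(amounts, low, high) {
  const results = [];
  const stack = [{ i: 0, sum: 0, indexes: [] }];
  while (stack.length > 0) {
    const { i, sum, indexes } = stack.pop();
    // Out of elements, or amounts[i] (and every later amount) would overshoot
    if (i === amounts.length || sum + amounts[i] > high) {
      if (low < sum && sum < high) results.push(indexes);
      continue;
    }
    // Push "skip" first so that "take" is processed first, as in the recursion
    stack.push({ i: i + 1, sum, indexes });
    stack.push({ i: i + 1, sum: sum + amounts[i], indexes: indexes.concat(i) });
  }
  return results;
}

console.log(JSON.stringify(subsetsInRange([1, 2, 3, 4], 6, 8))); // [[0,1,3],[2,3]]
```

The stack never holds more than about one pending entry per element of the window, so depth is no longer bounded by the engine's call stack.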
גלעד ברקן

I ended up using the indices suggestion from Jonas W and here's what ended up working for me. I can change the window size by changing the range_window variable.

const amounts_matrix = [1380.54,9583.33,37993.04,3240.96,9583.33,814.24,6000.00.....


total_permutations = 0;
total_hits = 0;
target_range = 1
target = 130000
low_threshold = target - target_range
high_threshold = target + target_range
range_window = 12
batch_max = 12


for(x = 0; x < amounts_matrix.length-(range_window-1); x++){
    amounts = amounts_matrix.slice(x, x + range_window)

    for(batch_size = 0; batch_size <= batch_max; batch_size++){

        const positions = Array.from({length: batch_size}, (_, i) => i);
        //calculate the upper thresholds for each position
        var position_thresholds = new Array(batch_size)
        for(i = 0; i < positions.length; i++){
            position_thresholds[i] = i + amounts.length - positions.length
        }   

        var position = positions[positions.length-1];

        while(positions[0] < position_thresholds[position]){
            stormy_loop(positions, position)
        }

    }
}


function stormy_loop(positions, position){
    if(positions[position] <= position_thresholds[position]){
        totalcheck(positions)
        positions[position] += 1;
    }
    else{
        while(positions[position] > position_thresholds[position]){
            position -= 1
            positions[position] += 1;
        }
        cascade(positions,position)   
    }
}

function cascade(positions,position){
    base = positions[position]
    for(i = position + 1; i < positions.length; i++){
        position += 1
        base += 1
        positions[position] = base;
    }
}


function totalcheck(array){ 

    total_permutations += 1;
    output_array = []

    sum_amount = 0
    for(z = 0; z < array.length; z++){
        sum_amount += amounts[array[z]]
        output_array.push(amounts[array[z]])
    }

    if(sum_amount > low_threshold && sum_amount < high_threshold){
        total_hits += 1
        console.log(output_array)
        console.log(sum_amount.toFixed(2))
        console.log("total hits = " + total_hits)
        console.log("total permutations = " + total_permutations)
        console.log("---------------")
    }

}
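
For comparison, the hand-written nested loops from the question can also be expressed as one recursive function that carries a running sum and checks it at every level, so no list of combinations is ever built. This is only a sketch under my own naming: `searchWindow` and its `onHit` callback are hypothetical, and the bounds are exclusive like `totalcheck`.

```javascript
// Hypothetical helper: visits every non-empty combination of amounts in original order,
// keeping only the current path and its running sum in memory.
function searchWindow(amounts, low, high, onHit) {
  let checked = 0;
  const chosen = [];
  // start plays the role of "j = i + 1" in the hand-written loops
  function recurse(start, sum) {
    for (let i = start; i < amounts.length; i++) {
      chosen.push(amounts[i]);
      const next = sum + amounts[i];
      checked++;
      if (next > low && next < high) onHit(chosen.slice(), next);
      recurse(i + 1, next); // equivalent to one more nested loop level
      chosen.pop();
    }
  }
  recurse(0, 0);
  return checked;
}

const hits = [];
const checked = searchWindow([1, 2, 3, 4], 6, 8, (combo) => hits.push(combo));
console.log(JSON.stringify(hits), checked); // [[1,2,4],[3,4]] 15
```

Calling it once per `amounts_matrix.slice(x, x + range_window)` would reproduce the sliding window; the recursion depth is at most the window size.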
asetniop

The twelve nested loops from the question can become something like this:

function loopinator(){
    for(i=0; i<total_amounts; i++){
        for(j=0; j<11; j++){//You had 11 loops plus root loop
            entries.push(amounts[( i+j )]);//Will increase with each iteration of j loop, simulating loop branching
        }
        //Will run at same time as it did before, after all nested roots, but before next iteration of i loop
        totalcheck(entries);
        entries = [];

    }
}
Ben
  • This only seems to target things that occur in direct sequence; if my base set is [1,2,3,4,5,6] and my target is 9 it will catch [2,3,4] and [4,5] but not [1,2,6] and [3,6]. – asetniop Mar 10 '18 at 22:00