
Given an array N of 1,000,000 unique integers ranging from 0 to 1,999,999, what is the fastest way to filter out the integers that do not fall within any range in M, where M is a fixed group of 10 random ranges whose bounds are integers from 0 to 1,999,999?

Short sample with smaller numbers:

Given this set N of unique integers: [1,5,7,8,20,22,30] and this set M of ranges: [(1,6), (19,21), (23,50)]

Find which values of N fall within any range of M (bounds are inclusive).

Solution: [1,5,20,30]
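
The sample can be checked with a direct brute-force test. This is only a reference sketch for verifying results on small inputs, not a fast solution; the helper name `in_any_range` is made up for illustration.

```python
def in_any_range(n, ranges):
    """Return True if n lies inside at least one (lo, hi) range, bounds inclusive."""
    return any(lo <= n <= hi for lo, hi in ranges)


N = [1, 5, 7, 8, 20, 22, 30]
M = [(1, 6), (19, 21), (23, 50)]

# Check every number against every range: O(len(N) * len(M)) comparisons.
result = [n for n in N if in_any_range(n, M)]
print(result)  # [1, 5, 20, 30]
```

With only 10 ranges this costs at most 10 comparisons per number, so it also works as a baseline to time faster approaches against.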

Java is preferred (to run time/complexity tests), but any other language is fine.

fastmath
  • 2
    It might not turn out to be worth the RAM, but since the range is half full to begin with, and you are going for speed, consider populating the initial array sparsely, as an array of 2M items, with each populated item having the value of its index, and non-populated array cells empty. Then use "Slice" operations to pull out the included ranges and append them to a Results array. Filter out the empty values. Speed should be hard to beat, but it won't be the best on memory consumption. – Aadmaa May 19 '21 at 02:25
  • (Just had a flashback to *convert to a combinatorial circuit, minimise, convert to source code, see what an "optimising" compiler spews out*. **Not** helpful for a single million of values, but if `fixed group of [ranges]` means *the same ten ranges for many millions*…) – greybeard May 19 '21 at 03:57
  • What is the issue in simple linear iteration? That should be efficient enough imo – Abhinav Mathur May 19 '21 at 04:17
  • @AbhinavMathur What do you mean? – fastmath May 19 '21 at 13:00
  • @fastmath iterate through the lists, using a pointer for each list. Include the current number in the final list if it belongs in the current range. If not, check if it belongs to the next range (until the range doesn't include the given element) – Abhinav Mathur May 19 '21 at 13:28
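
The pointer-per-list iteration described in the last comment could look roughly like the sketch below. It assumes both lists need sorting first (the question does not say whether N or M arrive sorted), and the function name `filter_in_ranges` is hypothetical. Note that the result comes back in ascending order rather than in the original order of N.

```python
def filter_in_ranges(nums, ranges):
    """Two-pointer sweep: one pointer over sorted numbers, one over sorted ranges."""
    nums = sorted(nums)      # assumption: input order is not guaranteed
    ranges = sorted(ranges)  # sorted by lower bound
    kept = []
    r = 0
    for n in nums:
        # Ranges ending before n cannot contain n or any later (larger) number.
        while r < len(ranges) and ranges[r][1] < n:
            r += 1
        if r == len(ranges):
            break  # no ranges left, so no remaining number can match
        # ranges[r] ends at or after n; n is inside it if it also starts at or before n.
        if ranges[r][0] <= n:
            kept.append(n)
    return kept


print(filter_in_ranges([1, 5, 7, 8, 20, 22, 30], [(1, 6), (19, 21), (23, 50)]))
# [1, 5, 20, 30]
```

If both inputs are already sorted, the two `sorted` calls can be dropped and the sweep itself is linear in len(N) + len(M).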

1 Answer


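Using Python 3 with NumPy: mark every value covered by a range in a lookup table, then keep the numbers whose entry is set. Hope this is helpful.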

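import numpy as np

ranges = [(1, 6), (19, 21), (23, 50)]
nums = [1, 5, 7, 8, 20, 22, 30]

# Boolean lookup table indexed directly by value, so 0 is a valid entry.
# Sized from the largest upper bound; does not assume the ranges are sorted.
size = max(hi for _, hi in ranges) + 1
in_range = np.zeros(size, dtype=bool)

# Mark every value covered by a range (bounds are inclusive).
for lo, hi in ranges:
    in_range[lo:hi + 1] = True

# Keep numbers whose entry is set; values beyond the table are in no range.
result = [n for n in nums if n < size and in_range[n]]

print(result)  # [1, 5, 20, 30]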
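
If the 2M-entry table is not worth the memory, one alternative (not the method used above) is to merge the ranges and binary-search each number against the range starts. This is a minimal sketch using only the standard library; `merge_ranges` and `filter_with_bisect` are made-up names, and it assumes integer values so that touching ranges such as (1,3) and (4,6) can be merged.

```python
from bisect import bisect_right


def merge_ranges(ranges):
    """Merge overlapping or adjacent inclusive integer ranges into disjoint ones."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return merged


def filter_with_bisect(nums, ranges):
    """Keep numbers inside any range, preserving the input order of nums."""
    merged = merge_ranges(ranges)
    starts = [lo for lo, _ in merged]
    kept = []
    for n in nums:
        # Index of the last merged range that starts at or before n.
        i = bisect_right(starts, n) - 1
        if i >= 0 and n <= merged[i][1]:
            kept.append(n)
    return kept


print(filter_with_bisect([1, 5, 7, 8, 20, 22, 30], [(1, 6), (19, 21), (23, 50)]))
# [1, 5, 20, 30]
```

With 10 ranges each lookup is a handful of comparisons, and the input does not need to be sorted.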