Solution
I ended up using the following implementation, a slight modification of the answer by David Fletcher:
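import Data.Maybe (catMaybes)
import Data.Universe.Helpers (diagonal) -- assumption: diagonal as exported by universe-base

isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
  where matches y = [if x == y then Just x else Nothing | x <- xs]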
This can be augmented with nub to filter out duplicates:
import Data.List (nub) -- goes with the other imports
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e., (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
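To make the indexing concrete, here is a small sketch on two finite strings. The name grid and the example inputs are illustrative assumptions, not part of the original answer; matches is the same helper as in the solution.

```haskell
-- The comparison grid built by `map matches`.
-- `grid` is a hypothetical name used only for this illustration.
grid :: Eq a => [a] -> [a] -> [[Maybe a]]
grid xs = map matches
  where
    -- one row per element y of ys, one column per element x of xs
    matches y = [if x == y then Just x else Nothing | x <- xs]

-- grid "abc" "cab" evaluates to
--   [ [Nothing,  Nothing,  Just 'c']   -- 'c' compared with 'a', 'b', 'c'
--   , [Just 'a', Nothing,  Nothing ]   -- 'a' compared with 'a', 'b', 'c'
--   , [Nothing,  Just 'b', Nothing ] ] -- 'b' compared with 'a', 'b', 'c'
```

Here grid "abc" "cab" !! 0 !! 2 is Just 'c', because "cab" !! 0 and "abc" !! 2 are both 'c'.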
diagonals takes a list of lists and returns a list of lists where the nth output list contains one element from each of the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat . diagonals).
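The behaviour described for diagonals and diagonal can be sketched with base alone. This is a hypothetical stand-in written for illustration, not the library's implementation: the helper go and its local names are assumptions, and the order of elements within a single diagonal may differ from the library's.

```haskell
-- Stand-in for `diagonals`: the output list at index n holds the entries
-- whose row index and column index sum to n. Works on infinite input.
diagonals :: [[a]] -> [[a]]
diagonals = go []
  where
    -- `active` holds the unconsumed tails of the rows brought in so far
    go active rows
      | null heads && null rest = []
      | otherwise               = heads : go remaining rest
      where
        (new, rest) = splitAt 1 rows            -- bring in at most one new row
        current     = new ++ active
        heads       = [h | (h : _) <- current]  -- exhausted rows drop out here
        remaining   = [t | (_ : t) <- current]

-- Flat version, as stated in the text.
diagonal :: [[a]] -> [a]
diagonal = concat . diagonals

-- diagonals [[1,2,3],[4,5,6],[7,8,9]] == [[1],[4,2],[7,5,3],[8,6],[9]]
-- diagonal  [[1,2,3],[4,5,6],[7,8,9]] == [1,4,2,7,5,3,8,6,9]
```

Because each step brings in only one new row and takes only one element from each row, the sketch also produces output for infinitely many infinite rows.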
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
With the catMaybes . matches variant, we would therefore encounter the same problem as in the original question: while one element does not match any element in the other list, the program hangs. The chosen solution, by contrast, hangs only while no matches are found for any element in either list.
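The difference between the two orderings shows up on infinite inputs. The sketch below is a minimal illustration under stated assumptions: diagonal is a base-only stand-in for the library function, and the names isectEager, evens and threes are hypothetical, introduced only for this example.

```haskell
import Data.Maybe (catMaybes)

-- Base-only stand-in for the library's `diagonal` (an assumption, so that
-- the sketch is self-contained): visits entries by row index + column index.
diagonal :: [[a]] -> [a]
diagonal = concat . go []
  where
    go active rows
      | null heads && null rest = []
      | otherwise               = heads : go remaining rest
      where
        (new, rest) = splitAt 1 rows
        current     = new ++ active
        heads       = [h | (h : _) <- current]
        remaining   = [t | (_ : t) <- current]

matches :: Eq a => [a] -> a -> [Maybe a]
matches xs y = [if x == y then Just x else Nothing | x <- xs]

-- Chosen solution: Nothing values are kept until after the interleaving.
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const []
isect xs = catMaybes . diagonal . map (matches xs)

-- Variant from the note: Nothing values are dropped per row, before the
-- interleaving. `isectEager` is a hypothetical name.
isectEager :: Eq a => [a] -> [a] -> [a]
isectEager xs = diagonal . map (catMaybes . matches xs)

evens, threes :: [Integer]
evens  = [0, 2 ..]
threes = [3, 6 ..]

-- take 3 (isect evens threes) == [6,12,18]
-- take 1 (isectEager evens threes) never returns: the row for 3 contains
-- no match, so asking for its first element searches evens forever.
```

In the second case diagonal has to inspect the head of the first row before it can move on, and that row never produces one.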