I have two 2D numpy arrays shaped:
(19133L, 12L)
(248L, 6L)
In each case, the first 3 fields form an identifier.
I want to reduce the larger matrix so that it only contains rows with identifiers that also exist in the second matrix. So the shape should be (248L, 12L). How can I do this?
I would then like to sort it so that the arrays are indexed by the first value, second value and third value so that (3 3 4) comes after (3 3 5) etc. Is there a multi field sort function?
Edit:
I have tried pandas:
df1 = DataFrame(arr1.astype(str))
df2 = DataFrame(arr2.astype(str))
df1.set_index([0,1,2])
df2.set_index([0,1,2])
out = merge(df1,df2,how="inner")
print(out.shape)
But this results in (0,13) shape