I have 3 files, each containing an arbitrary number of rows (the count is given on the first line). I want to find all the rows common to every file. In each file, the first line holds the number of rows, and each subsequent line contains space-separated coordinates.
file1.txt:
5
820.3 262.48 637.815 232.503
657.666 773.366 466.608 754.035
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
518.451 899.594 343.431 881.08
file2.txt:
3
1.52 6.878 9.5485
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
file3.txt:
4
657.666 773.366 466.608 754.035
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
518.451 899.594 343.431 881.08
My output file res.txt should be:
res.txt:
2
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
Here there are 2 common rows, so 2 is printed on the first line. How can I scale this to an arbitrary number of files?
I have tried writing a Python script that handles two files, but I don't think it's very efficient. The code I tried is:
import numpy as np

l1 = []
l2 = []

# Read both files, splitting each line into its whitespace-separated fields
with open('matchings1_2.txt', 'r') as f1:
    for line in f1:
        l1.append(line.split())
with open('matchings2_3.txt', 'r') as f2:
    for line in f2:
        l2.append(line.split())

# Drop the row count on the first line and convert to floats
l1 = np.array(l1[1:]).astype(float)
l2 = np.array(l2[1:]).astype(float)

l = []
for r in l1:
    # `r in l2` is wrong for a 2-D numpy array: it is True if ANY single
    # element matches, not if a whole row matches. Compare full rows instead.
    if (l2 == r).all(axis=1).any():
        l.append(list(r))

# Prepend the number of common rows
l.insert(0, [len(l)])

with open('Result.txt', 'w') as f:
    for item in l:
        f.write(" ".join(str(x) for x in item) + "\n")