I want to compare the values of a specific column in a txt file named A with the same column in 10 other txt files (A_1, A_2, ..., A_10) and compute the sum of squared differences for each file. Basically, I want Python to print the 3 smallest sums and their corresponding txt files. I have been able to compute the differences, but I am stuck on how to find the 3 smallest ones and their corresponding files.

import numpy as np
filelist=[]
for i in range(1,11):
    filelist.append("/Users/Hrihaan/Desktop/A_%s.txt" %i)
for fname in filelist:
    data=np.loadtxt(fname)
    data1=np.loadtxt('/Users/Hrihaan/Desktop/A.txt')
    x=data[:,1]
    x1=data1[:,1]
    x2=(x-x1)**2
    x3=sum(x2)
    print(fname)
    print(x3)
Hrihaan
  • What's your definition of "Smallest differences"? Do you mean the file with the fewest different entries in that column you're looking for? – Davy M Jul 29 '17 at 03:21
  • @DavyM Basically, I want to find the sum of squared differences of the values in a specific column for each of the 10 txt files (A_1, ..., A_10) compared with the txt file named A. I want to find out which 3 of these 10 txt files have the smallest sums. – Hrihaan Jul 29 '17 at 03:48

1 Answer

Your current code finds the differences for each file.

You can store these in a list (filesAndDiffs) of tuples containing each filename paired with its difference.

At the end, sort this list by the second element of each tuple (key=lambda x: x[1]) and print each pair. To print only the three smallest, slice the sorted list with [:3].

import numpy as np
filelist=[]
for i in range(1,11):
    filelist.append("/Users/Hrihaan/Desktop/A_%s.txt" %i)
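filesAndDiffs = []
# Load the reference file once, outside the loop
data1=np.loadtxt('/Users/Hrihaan/Desktop/A.txt')
x1=data1[:,1]
for fname in filelist:
    data=np.loadtxt(fname)
    x=data[:,1]
    x2=(x-x1)**2
    x3=np.sum(x2)
    # Pair each filename with its sum of squared differences
    filesAndDiffs.append((fname, x3))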
print("Filename, Diff")  # Print a title for the table
for fname, diff in sorted(filesAndDiffs, key=lambda x: x[1])[:3]:
    print(fname, diff)
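
If you only need the few smallest entries, heapq.nsmallest from the standard library is an alternative to sorting the whole list. Below is a minimal sketch of that idea. The function name smallest_diffs and the small in-memory arrays are made up for illustration and stand in for the data loaded from A.txt and the A_i.txt files; it also assumes every file has the same number of rows as the reference.

```python
import heapq
import numpy as np

def smallest_diffs(reference, candidates, column=1, n=3):
    """Return the n (name, sum of squared differences) pairs with the smallest sums.

    `reference` is a 2-D array; `candidates` maps a name to a 2-D array
    with the same number of rows as `reference`.
    """
    ref_col = reference[:, column]
    diffs = [
        (name, float(np.sum((data[:, column] - ref_col) ** 2)))
        for name, data in candidates.items()
    ]
    # nsmallest returns the n smallest pairs, ordered by the sum (smallest first)
    return heapq.nsmallest(n, diffs, key=lambda pair: pair[1])

# Made-up data standing in for np.loadtxt(...) results (assumption, not real files)
reference = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
candidates = {
    "A_1": reference + 3.0,
    "A_2": reference + 0.5,
    "A_3": reference + 2.0,
    "A_4": reference + 1.0,
}

result = smallest_diffs(reference, candidates)
for name, diff in result:
    print(name, diff)
```

With real files you would build candidates from np.loadtxt calls instead. For only 10 files the difference from sorted(...)[:3] is negligible, so this is mostly a matter of taste.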