I need your help because, after a lot of research, I haven't found an answer to my problem.

I have 2 files which contain some information. Some of this information is the same in both files, the rest is different. The first file is sorted; the second one is not.

I tried to use difflib, but apparently it doesn't work in my case.

Example

File 1 :

customerID: aa
companyName: AA
contacts: AAAA AAAA <aa@aa.fr>

File 2 :

customerID: zz
username: z.z
contacts: ZZZ ZZZ <zz@zz.com>

I need to find out whether the customerID is the same in both files.

Here is my code :

import sys
import difflib


def changes(file1, file2):
    # Open the two files to compare; "with" closes them automatically.
    with open(file1, 'r') as master, open(file2, 'r') as slave:
        # Compute a line-by-line unified diff of the two files.
        diff = difflib.unified_diff(master.readlines(), slave.readlines())
        print(''.join(diff))


def main(argv=None):
    if argv is None:
        argv = sys.argv
    # The script name plus the two file paths gives three arguments.
    if len(argv) != 3:
        print("This program needs 2 files")
        return 1
    print(argv[1])
    print(argv[2])
    changes(argv[1], argv[2])
    return 0


if __name__ == '__main__':
    status = main()
    sys.exit(status)

Edit: The files are .txt files that I formatted like this myself.

CRC
  • Have you tried using regex to extract customerID info from both files and compare them? – Lucas Infante Oct 13 '15 at 13:19
  • I'm not sure what you want to do. Do you want to check whether all customerIDs from the first file are also present in the second file? – Mihai Hangiu Oct 13 '15 at 13:22
  • I want to check if the customerID in the two files are the same and only print the ones which are not – CRC Oct 13 '15 at 13:38
  • @maccinza No. I have read a lot about Python, but most of what I found advises against using regex, and I don't know how to use regex. – CRC Oct 13 '15 at 13:39
  • I suggest you read https://docs.python.org/2/library/re.html and build a regular expression to filter out the info you want from each file. – Lucas Infante Oct 13 '15 at 13:45
  • @maccinza I will read it, thank you. I have read a lot of documentation on the web. – CRC Oct 13 '15 at 13:55
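
The regex approach suggested in the comments above could look roughly like the following. This is only a sketch: the names CUSTOMER_ID_RE and extract_customer_ids are hypothetical, and it assumes every ID sits on its own line of the form "customerID: value" and contains no whitespace.

```python
import re

# Assumption: each ID is on its own line, e.g. "customerID: zz",
# and the ID itself contains no spaces.
CUSTOMER_ID_RE = re.compile(r"^customerID:[ \t]*(\S+)", re.MULTILINE)


def extract_customer_ids(text):
    """Return the customerID values found in `text`, in order of appearance."""
    # With a single capture group, findall returns the captured strings.
    return CUSTOMER_ID_RE.findall(text)
```

Passing the full contents of a file (for example the result of read()) returns only the ID values, without the "customerID:" prefix, so the two files can then be compared value by value.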

1 Answer

with open('first.txt', 'r') as first_file:
    for line in first_file:
        data = line.split(":")
        # Python strings use strip(), not trim()
        if data[0].strip() == "customerID":
            customer_id = data[1].strip()
            # Re-scan the second file for every ID found in the first one
            with open('second.txt', 'r') as second_file:
                for second_file_line in second_file:
                    data2 = second_file_line.split(":")
                    if data2[0].strip() == "customerID":
                        if customer_id == data2[1].strip():
                            pass  # <do your work>

If your files are too big to load into memory, search the second file line by line:

with open('second.txt', 'r') as second_file:
    for line in second_file:
        # Substring test: also matches IDs that merely contain customer_id
        if customer_id in line:
            pass  # <do your work>

Or, if the files are small enough to read in one go:

with open('second.txt', 'r') as second_file:
    if customer_id in second_file.read():
        pass  # <do your work>
Sanch
  • Yes, I tried this, but it didn't work the way I wanted. – CRC Oct 13 '15 at 13:41
  • Can you tell me what problem you faced when you tried it? – Sanch Oct 13 '15 at 13:44
  • I didn't face any technical problem, only that when I used this method it printed whole lines like this: customerID: ZZZ, customerID: WWWW, customerID: XXXX. I need only the value of the customerID, to compare it between the two files. – CRC Oct 13 '15 at 13:53
  • If I understand you correctly, you need only the customer IDs, like ZZZ, WWWW, XXXX. For this you can either split the line string or use a regex. – Sanch Oct 13 '15 at 13:56
  • That's it! I only need to compare the value of the customerID in the two files. – CRC Oct 13 '15 at 13:59
  • OK, you only need to compare the customerID, not the whole string. The whole string may differ, but the customer ID will be the same. – Sanch Oct 13 '15 at 14:03
  • For the above solution, my assumption is that your files are large. – Sanch Oct 13 '15 at 14:11
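
To compare only the customerID values and print the ones that do not match, as discussed in the comments above, one option is to collect the IDs of each file into a set and take the set differences. This is a minimal sketch, not the answerer's method: the function names read_customer_ids and compare_customer_ids are hypothetical, and it assumes each ID is on its own line of the form "customerID: value".

```python
def read_customer_ids(path):
    """Return the set of customerID values found in the file at `path`."""
    ids = set()
    with open(path, "r") as handle:
        for line in handle:
            # partition() splits on the first ":" only, so a value that
            # itself contains ":" is kept whole.
            key, sep, value = line.partition(":")
            if sep and key.strip() == "customerID":
                ids.add(value.strip())
    return ids


def compare_customer_ids(path1, path2):
    """Return (IDs only in the first file, IDs only in the second file)."""
    first = read_customer_ids(path1)
    second = read_customer_ids(path2)
    return sorted(first - second), sorted(second - first)
```

Calling compare_customer_ids('first.txt', 'second.txt') returns two sorted lists of IDs that appear in one file but not the other. Because sets ignore order, it does not matter that the second file is unsorted, and each file is read only once.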