I'm new to Python. I want to print the content of a file whose lines look like this:

Mashed Potatoes , topped with this and that ...................... 9.99$

that is, every entry has the form

Product_name , description ......................... price

I want to match it against a second file that contains only product names:

Mashed Potatoes

Past

Caesar Salad

etc. etc.

The entries in the first file are not in a uniform order, which is why I'm trying a search, match, and print approach.

I hope you understand my problem.

This is what I have tried:

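    import re

    content_file = open('/Users/ashishyadav/Downloads/pdfminer-20110515/samples/te.txt', "r")
    product_list = open('/Users/ashishyadav/Desktop/AQ/te.txt', "r")
    output = open("output.txt", "w")
    # Read the whole content file as one lowercase string.
    line = content_file.read().lower().strip()
    for prod in product_list:
        name = prod.lower().strip()
        if not name:
            continue  # a blank line would match at every position
        # re.escape makes the product name match literally.
        for match in re.finditer(re.escape(name), line):
            s = match.start()
            e = match.end()
            # Write "name<TAB>start:end", one match per line.
            output.write('%s\t%d:%d\n' % (match.group(), s, e))
    output.close()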

My code matches the product list file against the full content file, but it gives me only the index of each product name, not the description and price.

What I want is an index/span running from the product name to the price.

For example, for the entry from "mashed potatoes" to "9.99$" I want [0:58], but I'm only getting [0:14].

Is there also a way to print the description and price using the same approach?

Thanks in advance.

boyfromnorth

1 Answer

  • Read the whole "second file" into a set X.
  • Read the "first" file line by line.
  • For each line, extract the part before the comma.
  • If this part is in the set X, print whatever is desired.

Let me know if you need this in Python.

# Read the whole "second file" into a set X,
# stripping the trailing newline from each name.
with open('foo') as fp:
    names = set(name.strip() for name in fp if name.strip())

# Read the "first" file line by line.
with open('bar') as fp:
    for line in fp:

        # For each line, extract the part before the comma
        # (strip the space that precedes the comma).
        name = line.split(',')[0].strip()

        # If this part is in the set X, print whatever is desired.
        if name in names:
            print(line.rstrip('\n'))
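
The steps above print the whole matching line. If the start and end offsets of each full entry (name through price) are also wanted, one possible sketch keeps a running character offset while walking the lines. It is written for Python 3 and makes several assumptions that are not in the question: each entry sits on a single line, the description is separated from the price by a run of two or more dots, and the price is a number followed by `$`. The names `MENU_TEXT`, `PRODUCT_NAMES`, `ENTRY`, and `find_entries` are hypothetical, and the sample strings stand in for the two files.

```python
import re

# Hypothetical sample data standing in for the two files.
MENU_TEXT = (
    "Mashed Potatoes , topped with this and that ...... 9.99$\n"
    "Garlic Bread , with butter ...... 3.50$\n"
    "Caesar Salad , romaine and croutons ...... 7.25$\n"
)
PRODUCT_NAMES = ["Mashed Potatoes", "Caesar Salad"]

# Assumed entry layout: name , description <two or more dots> price$
ENTRY = re.compile(
    r"^(?P<name>[^,]+?)\s*,\s*(?P<desc>.*?)\s*\.{2,}\s*"
    r"(?P<price>\d+(?:\.\d+)?\$)\s*$"
)


def find_entries(menu_text, product_names):
    """Return (name, description, price, start, end) for each wanted entry.

    start and end are character offsets into menu_text, covering the
    entry from the first character of the name to the end of the price.
    """
    wanted = {n.strip().lower() for n in product_names if n.strip()}
    results = []
    offset = 0  # offset of the current line within menu_text
    for line in menu_text.splitlines(keepends=True):
        m = ENTRY.match(line.rstrip("\n"))
        if m and m.group("name").lower() in wanted:
            start = offset + m.start("name")
            end = offset + m.end("price")
            results.append(
                (m.group("name"), m.group("desc"), m.group("price"), start, end)
            )
        offset += len(line)
    return results


if __name__ == "__main__":
    for name, desc, price, start, end in find_entries(MENU_TEXT, PRODUCT_NAMES):
        print("%s\t%s\t%s\t[%d:%d]" % (name, desc, price, start, end))
```

Slicing the text with the reported offsets, `MENU_TEXT[start:end]`, gives back the full entry. Lines that do not fit the assumed layout are skipped silently, so the pattern would need adjusting for real menu data.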
georg
  • Well, yup, that's what I'm trying to do. I just don't need the part before the comma; I'm already getting the item names and their index locations when I match the two files. What I want is the full content, that is name, description, and price, with the start and end index of each product's full content. I hope I'm able to explain. – boyfromnorth May 18 '12 at 09:40
  • Can you please write some sample code from which I can get a better idea of how to extract the part before the comma and then whatever is desired? – boyfromnorth May 18 '12 at 09:53
  • @ashish.god5: I still fail to understand what you're trying to achieve here. What is the ultimate goal? – georg May 18 '12 at 13:35
  • Can you please give me some sample Python code for the mechanism you explained above, with all the steps? It will help me solve my problem. – boyfromnorth May 21 '12 at 06:32