I am trying to use a regex to extract Canadian postal codes from each line of a CSV file.

Environment: Python 3.6 on Win 10. Code tested through Jupyter Notebook and through the Win 10 CLI prompt.

The problem is that I can't get the matched string back when I run the search in a for loop over a CSV file.

Using re on a list works fine:

import re    
address = [ 'H1T3R9',
          '/a/b/c/la_seg_x005_y003.npy',
          'H1K 3H3',
          'F2R2V2',
          'H1L 3W6',
           'j1r 4v5',
          '/y', 
          'h2r 2x8',
          'J9R 5V9',
          'Non disponible, h2r 2x8, montreal']

# I also tried this one at some point:
# r'^((\d{5}-\d{4})|(\d{5})|([AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]\d[A-Za-z]\s?\d[A-Za-z]\d))$'
regex = re.compile(r'\b[a-z]\d[a-z]\s\d[a-z]\d\b')

goodPostalCode = filter(regex.search, address)
print(*goodPostalCode)

Output:

j1r 4v5 h2r 2x8 Non disponible, h2r 2x8, montreal

But when adding the CSV component it seems to break.

import re
import csv
with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        #print(row)
        regex = re.compile(r'\b[a-z]\d[a-z]\s\d[a-z]\d\b')
        postcode = filter(regex.search, row[7])
        print(postcode)

Output:

<filter object at 0x000001E4FA70D908>

A filter object is printed on every iteration instead of the matched string.

My understanding was that I could loop through the CSV, since each line would be returned as a list or a tuple, and then use re to find matching patterns in the string at a specific column, using its index.

Where am I going wrong here?

Brendan Abel
Davesdere

1 Answer

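You don't need filter in the loop, because row[7] is a single string, not a list of strings. filter iterates over whatever it is given, so here it tests each character of the string separately, and no single character can match the pattern. In Python 3, filter also returns a lazy iterator, which is why print shows <filter object ...> rather than any matches. Call regex.search on the string directly instead.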
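
A minimal sketch of that difference, using the pattern from the question. The `cell` value is a hypothetical stand-in for one `row[7]` entry, taken from the sample list in the question:

```python
import re

regex = re.compile(r'\b[a-z]\d[a-z]\s\d[a-z]\d\b')
cell = 'Non disponible, h2r 2x8, montreal'  # hypothetical row[7] value

# filter() returns a lazy iterator in Python 3; printing it shows only the object.
lazy = filter(regex.search, cell)
lazy_repr = repr(lazy)

# Iterating a string yields single characters, and no single character
# can match a seven-character pattern, so nothing survives the filter.
per_character = list(lazy)

# Searching the whole string once is what actually finds the code.
match = regex.search(cell)
found = match.group(0) if match else None

print(lazy_repr)
print(per_character, found)
```

Applied to the CSV loop, that means calling regex.search on row[7] directly: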

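import csv
import re

regex = re.compile(r'\b[a-z]\d[a-z]\s\d[a-z]\d\b')
codes = []
with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # row[7] is one string, so search it directly
        if regex.search(row[7]):
            codes.append(row[7])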

Alternatively, you could collect the column values into a list first, and then run filter over that list:

import csv
import re

with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    lines = [row[7] for row in reader]

regex = re.compile(r'\b[a-z]\d[a-z]\s\d[a-z]\d\b')
goodPostalCode = filter(regex.search, lines)
print(*goodPostalCode)  # unpacking consumes the lazy filter object
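
One possible reason a search like this comes back empty: the pattern matches only lowercase letters and requires a whitespace character in the middle, so values such as H1T3R9 or H1K 3H3 from the question's list are skipped. Below is a rough sketch that relaxes both constraints and returns the matched code itself rather than the whole cell. The `extract_postal_codes` helper, the `POSTAL_CODE` name and the in-memory sample are all hypothetical, and the relaxed pattern is an assumption about what the data looks like, not a full validator for Canadian postal codes.

```python
import csv
import io
import re

# Assumption: codes may be upper or lower case, with or without the space.
POSTAL_CODE = re.compile(r'\b[a-z]\d[a-z]\s?\d[a-z]\d\b', re.IGNORECASE)

def extract_postal_codes(lines, column):
    """Return every postal code found in the given column of CSV input."""
    codes = []
    for row in csv.reader(lines):
        if len(row) <= column:
            continue  # skip blank or short rows
        codes.extend(POSTAL_CODE.findall(row[column]))
    return codes

# Hypothetical two-column sample standing in for data.csv.
sample = io.StringIO(
    'name,address\n'
    'a,H1T3R9\n'
    'b,"Non disponible, h2r 2x8, montreal"\n'
    'c,/a/b/c/la_seg_x005_y003.npy\n'
)
print(extract_postal_codes(sample, column=1))
```

With the real file, the same helper could be called as `extract_postal_codes(f, column=7)` inside the `with open('data.csv', newline='') as f:` block.
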
Brendan Abel
  • Neither snippet returns any value. I found a workaround using a custom class called RE that I found online; with a few modifications to fit my needs, it works. I'm not sure why this code or my previous code didn't work, though. – Davesdere Mar 19 '18 at 20:39