I cannot figure out this error message. Can someone help? The error appears on the re.findall line.

import re, urllib.request
infile = open('phone_numbers.txt')

for line in infile:
    line = line.strip()

    area=line[0:3]
    area1=line[5:7]
    area2=line[8:12]

    xyz = 'http://usreversephonedirectory.com/results.php?areacode='+ area +'&phone1='+ area1 +'&phone2='+ area2 +'&imageField.x=193&imageField.y=16&type=phone&Search=Search&redir_page=results%2Fphone%2F'

    print(area + area1 + area2)

    page = urllib.request.urlopen(xyz)
    text = page.read()
    text = text.strip()

    location = re.findall('>Location:</strong>(.+)</span><br/>            <span><strong>Line', text)

    print(line + '|' + location[0])

infile.close()
AVI

1 Answer

Your text is being read as bytes, as @Ben said: page.read() returns a bytes object, and re.findall cannot apply a str pattern to bytes. Decoding the result of text.strip(), as he suggested, makes the error disappear. The method I used is as follows. You may want to tidy up its output. Hope this helps!

$ echo "1 (800) 233-2742" >> phone_numbers.txt # Put a random number into phone_numbers.txt
$ python lookup.py                             # Run the fixed program
1 (0)233-                                      # Output line 1
1 (800) 233-2742| ,                            # Output line 2
$                                              # Done
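Output line 1 above reads `1 (0)233-` because the fixed slices `[0:3]`, `[5:7]` and `[8:12]` do not line up with a number written as `1 (800) 233-2742`. A minimal sketch of a more tolerant split follows, assuming 10-digit US numbers with an optional leading 1. The helper names `split_phone` and `build_url` are hypothetical and not part of the original script; the query parameters are copied from it.

```python
import re
from urllib.parse import urlencode

BASE_URL = 'http://usreversephonedirectory.com/results.php'  # from the original script


def split_phone(raw):
    """Return (area, prefix, number) for a 10-digit US number, or None."""
    digits = re.sub(r'\D', '', raw)       # keep digits only
    if len(digits) == 11 and digits.startswith('1'):
        digits = digits[1:]               # drop a leading country code
    if len(digits) != 10:
        return None                       # not a layout this sketch understands
    return digits[0:3], digits[3:6], digits[6:10]


def build_url(area, prefix, number):
    """Build the lookup URL with the same query parameters as the original."""
    query = urlencode({
        'areacode': area,
        'phone1': prefix,
        'phone2': number,
        'imageField.x': 193,
        'imageField.y': 16,
        'type': 'phone',
        'Search': 'Search',
        'redir_page': 'results/phone/',   # urlencode escapes the slashes
    })
    return BASE_URL + '?' + query


parts = split_phone('1 (800) 233-2742')
print(parts)
if parts is not None:
    print(build_url(*parts))
```

Stripping non-digits first means the same function handles `408-857-7713`, `(408) 857-7713` and `1 (800) 233-2742` without separate slice offsets for each layout.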

The code (updated):

import re, urllib.request

with open('phone_numbers.txt') as infile:
    for line in infile:
        line = line.strip()

        # Fixed-position slices: these assume one specific number layout.
        area = line[0:3]
        area1 = line[5:7]
        area2 = line[8:12]

        xyz = 'http://usreversephonedirectory.com/results.php?areacode='+ area +'&phone1='+ area1 +'&phone2='+ area2 +'&imageField.x=193&imageField.y=16&type=phone&Search=Search&redir_page=results%2Fphone%2F'

        print(area + area1 + area2)

        page = urllib.request.urlopen(xyz)
        text = page.read()
        # read() returns bytes; decode to str so the str pattern below can be applied.
        text = text.strip().decode('utf-8')

        location = re.findall('>Location:</strong>(.+)</span><br/>            <span><strong>Line', text)

        # findall returns an empty list when nothing matches, so guard the index.
        print(line + '|' + (location[0] if location else ''))
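The error can be reproduced without any network access, which may make the fix easier to see. This is only an illustration: the `raw` value is a made-up stand-in for what `page.read()` returns, and the shortened pattern is not the one from the script.

```python
import re

pattern = r'>Location:</strong>(.+?)</span>'

# Made-up stand-in for page.read(): urlopen(...).read() returns bytes, not str.
raw = b'<span><strong>Location:</strong> San Jose, CA</span>'

try:
    re.findall(pattern, raw)      # str pattern applied to a bytes object
except TypeError as exc:
    print('TypeError:', exc)

# Decoding first gives a str, which a str pattern can search.
text = raw.decode('utf-8')
print(re.findall(pattern, text))
```

Decoding as UTF-8 is an assumption about the page. If the server declares a charset, `page.headers.get_content_charset()` should report it, and that value can be passed to `decode` instead.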
Koga
  • Thank you for the help! I am encountering another problem... Why is my code not extracting the location from the website? I'm not sure what to use in location = re.findall(... to make it pull the correct location data. – Eric Henderson Feb 22 '16 at 07:45
  • The website is: http://usreversephonedirectory.com/results.php?areacode=408&phone1=857&phone2=7713&imageField.x=0&imageField.y=0&type=phone&Search=Search&redir_page=results%2Fphone%2F – Eric Henderson Feb 22 '16 at 07:46
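One possible cause, which cannot be confirmed without the page source: the pattern in the script requires an exact run of spaces between `<br/>` and the next `<span>`, so any difference in whitespace (a newline, a tab, fewer spaces) makes it miss. A rough sketch of a more forgiving pattern follows. The `sample` fragment is hypothetical and the real markup may differ, and `find_location` is an illustrative helper name.

```python
import re

# Hypothetical fragment; check the real page source before relying on this.
sample = (
    '<span><strong>Location:</strong> San Jose, CA</span><br/>\n'
    '\t<span><strong>Line Type:</strong> Landline</span>'
)

# \s* tolerates any amount of whitespace (including newlines) between tags,
# and (.+?) stops at the first closing </span> instead of the last one.
LOCATION_RE = re.compile(
    r'>Location:</strong>\s*(.+?)\s*</span>\s*<br\s*/?>\s*<span><strong>Line',
    re.DOTALL,
)


def find_location(html):
    """Return the first location found, or None when nothing matches."""
    match = LOCATION_RE.search(html)
    return match.group(1) if match else None


print(find_location(sample))
print(find_location('<p>no results</p>'))
```

Returning `None` instead of indexing `location[0]` avoids an IndexError on numbers with no result. If the markup turns out to be more irregular than this, an HTML parser is likely a sturdier choice than a regular expression.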