
I am using an address parsing library that accepts strings in the following way:

import pyap
test_address = """
   4998 Stairstep Lane Toronto ON   
    """
addresses = pyap.parse(test_address, country='CA')
for address in addresses:
        # shows found address
        print(address)
        # shows address parts
        print(address.as_dict())

I would like to use this function on every row of a single pandas dataframe column. The dataframe contains two columns (id, address). This is what I have so far:

addresses.apply(lambda x: pyap.parse(x['address'], country='CA'),axis=1)

Though this runs, it returns a pandas Series instead of 'pyap.address.Address' objects.

Dominic Naimool
  • *would like to use this function on every row of a single pandas data-frame column* ... , then the series returns for each row the result of applying the function ... what is the problem? – ansev May 21 '20 at 21:01
  • the problem is that it is returning a series instead of a data type from the package, do you know if there is a way to pass them independently to the function? – Dominic Naimool May 21 '20 at 21:25

1 Answer


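You have to do what you are doing, but in reverse: instead of passing the function to the dataframe, pull the strings out of the dataframe and pass them to the function. Let's say your dataframe is this: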

import pandas as pd

d = [{'id': '1', 'address': '4998 Stairstep Lane Toronto ON'}, {'id': '2', 'address': '1234 Stairwell Road Toronto ON'}]
df = pd.DataFrame(d)
df

    id  address
0   1   4998 Stairstep Lane Toronto ON
1   2   1234 Stairwell Road Toronto ON

Extract these addresses to a list:

address_list = df['address'].tolist()

and then process each with pyap:

for al in address_list:
    addresses = pyap.parse(al, country='CA')
    for address in addresses:
        print(address)
        print(address.as_dict())
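If you want the parsed parts collected rather than printed, one possible approach is to build a list of dicts, one per address found, and keep the id of the row each came from. The sketch below is only an illustration: collect_address_parts is a hypothetical helper name, and it assumes the parser returns a list of objects that have an as_dict() method, as the loop above suggests pyap.parse does. The parser is passed in as a callable so the collecting logic does not depend on pyap itself.

```python
def collect_address_parts(rows, parse):
    """Return one dict per parsed address, tagged with its source row id.

    rows  : iterable of (row_id, text) pairs
    parse : callable taking a string and returning a list of objects
            that expose as_dict() (assumed to match pyap.parse's result)
    """
    records = []
    for row_id, text in rows:
        # A single string may yield zero, one, or several addresses.
        for address in parse(text):
            record = {'id': row_id}
            record.update(address.as_dict())
            records.append(record)
    return records
```

With the dataframe above, this could be called as collect_address_parts(zip(df['id'], df['address']), lambda text: pyap.parse(text, country='CA')), and the returned list passed to pd.DataFrame to get one row per parsed address. Rows where nothing is found simply contribute no records.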

Let me know if it works.

Jack Fleeting
  • This was perfect at accomplishing the first component of the task but it does not compile the results of the operation into a data structure. This was the intent of using a DF to begin with – Dominic Naimool May 22 '20 at 14:46
  • @DominicNaimool I'm not sure what you mean. Probably best thing to do is edit your question to add the exact expected output. – Jack Fleeting May 22 '20 at 17:09