I have the following table in pandas.

+--------+--------+--------+-----+
|   A    |   B    |   C    |  D  |
+--------+--------+--------+-----+
| foo    | b'bar0 | b'foo0 | 253 |
| b'bar0 | bar    | blah   | 485 |
+--------+--------+--------+-----+

I want to apply a function to every cell whose value starts with b'. The function is:

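def elementdecode(data, password):
    # Strip the leading b' and the trailing ' from the stored bytes literal
    aa = data[2:-1]
    bb = aa.encode()
    # Turn escape sequences such as \x8f back into the raw bytes
    cc = bb.decode('unicode-escape').encode('ISO-8859-1')
    # decode() is my decryption routine
    return decode(cc, password).decode()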

Background: I have a CSV file that contains both normal and encrypted values, and I would like to apply the decryption method only to the values that are encrypted. My plan is to read the CSV into pandas and apply the decryption function only to cells that are encrypted (i.e. that start with b'). Once the decryption has been performed, I export the data to a new CSV. Rather than using loops, I was thinking of using applymap, but I don't know how to restrict it to specific elements.

Thanks

valenzio

1 Answer

Have you tried this?

def elementdecode(x, password):
    # If the first condition is not met, the second is not evaluated
    if isinstance(x, str) and x.startswith("b'"):
        aa = x[2:-1]
        bb = aa.encode()
        cc = bb.decode('unicode-escape').encode('ISO-8859-1')
        return decode(cc, password).decode()
    else:
        return x

df = df.applymap(lambda x: elementdecode(x, password))
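
To try the idea end to end, here is a minimal sketch. The real decode function is not shown in the question, so the XOR-based decode below is only a hypothetical stand-in, and the password and sample values are made up. Note that applymap is deprecated since pandas 2.1 in favour of DataFrame.map, so the sketch picks whichever one is available.

```python
import pandas as pd


def decode(data: bytes, password: str) -> bytes:
    """Hypothetical stand-in for the real decryption routine (toy XOR cipher)."""
    key = password.encode()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def elementdecode(value, password):
    # Only strings that look like a bytes literal are treated as encrypted
    if isinstance(value, str) and value.startswith("b'") and value.endswith("'"):
        raw = value[2:-1].encode().decode("unicode-escape").encode("ISO-8859-1")
        return decode(raw, password).decode()
    # Everything else (plain strings, numbers, NaN) is returned unchanged
    return value


password = "pw"  # made-up password for the demo
# XOR is symmetric, so calling decode on plaintext "encrypts" it for the demo
token = repr(decode(b"bar0", password))

df = pd.DataFrame({"A": ["foo", token], "B": [token, "bar"], "D": [253, 485]})

# DataFrame.map replaces applymap from pandas 2.1 on; use whichever exists
elementwise = getattr(df, "map", None) or df.applymap
result = elementwise(lambda v: elementdecode(v, password))
print(result)
```

The integer column D passes through untouched because the isinstance check fails before any string method is called.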
Tbaki
  • How does this perform? Doesn't the 'if' statement make the operation very slow? – valenzio Jun 12 '17 at 09:06
  • @valenzio I don't think so. You have to filter one way or another anyway. You can make it faster by adapting the condition in the if statement to the specifics of your data, but the logic should stay the same. Is it working? – Tbaki Jun 12 '17 at 09:09
  • I updated my question. The problem with my DataFrame is that it contains different types: strings, ints, dates, NaN. Your approach only works if all entries are strings. I guess I could convert all the entries to strings, since I am writing them back to the CSV anyway. I will report back. – valenzio Jun 12 '17 at 09:12
  • @valenzio You can pass the condition into your function to have more freedom in the filtering. – Tbaki Jun 12 '17 at 11:56
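
The last two comments can be illustrated with a small sketch in which the condition is passed in as a function, so that mixed types (strings, ints, dates, NaN) are handled without converting everything to strings. The helper name transform_where, the predicate, and the placeholder strip_wrapper (which only removes the b'...' wrapper instead of decrypting) are all hypothetical.

```python
import numpy as np
import pandas as pd


def transform_where(df, predicate, func):
    """Apply func to cells where predicate is true; leave all other cells as they are."""
    def per_cell(value):
        return func(value) if predicate(value) else value
    # Column-wise Series.map works the same way across pandas versions
    return df.apply(lambda col: col.map(per_cell))


def looks_encrypted(value):
    # The isinstance test runs first, so non-strings never reach startswith
    return isinstance(value, str) and value.startswith("b'")


def strip_wrapper(value):
    # Placeholder for the real decryption: just drops the b'...' wrapper
    return value[2:-1]


df = pd.DataFrame({
    "A": ["foo", "b'bar0'", np.nan],
    "B": ["b'foo0'", 7, pd.Timestamp("2017-06-12")],
    "D": [253, 485, 1],
})

out = transform_where(df, looks_encrypted, strip_wrapper)
print(out)
```

Swapping in a different predicate changes which cells are touched without modifying the transformation itself.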