First I create a DataFrame using the pandas function pd.read_sql(). As far as I know, all imported columns are strings.
Then I create a new empty column and define a function, like so (tinyurl.com/tnr9b83):
df['status_update'] = ""
def f(row):
    if (row['priority'] in ("1","2")) and (row['failed'] == "Y"):
        val = "F"
    elif (row['priority'] in ("1","2")):
        val = row['status'].str.slice(0,1)
    else:
        val = "X"
    return val
Then I try to change every row of my data set so that:
- if a record has priority in ("1","2") and failed = "Y", it gets status_update = "F"
- else if a record has priority in ("1","2"), it gets a status_update = the first letter substring of column 'status'
- else it gets status_update = "X"
So I run:
df['status_update'] = df.apply(f, axis=1)
...but this gives:
AttributeError: 'str' object has no attribute 'str'
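One possible first step toward understanding this error is to inspect what row['status'] actually is when the function runs under apply(..., axis=1). The sketch below uses a made-up three-row frame (the values are hypothetical stand-ins, not the real pd.read_sql() result). It suggests that each row is a Series, so row['status'] is a plain Python str, while the .str accessor exists only on a whole column such as df['status'].

```python
import pandas as pd

# Hypothetical sample data standing in for the pd.read_sql() result
df = pd.DataFrame({
    "priority": ["1", "2", "3"],
    "failed": ["Y", "N", "N"],
    "status": ["Open", "Closed", "Pending"],
})

row = df.iloc[0]  # the kind of object f receives when axis=1

row_type = type(row).__name__                      # a pandas Series
cell_type = type(row["status"]).__name__           # a plain Python str
cell_has_str = hasattr(row["status"], "str")       # str has no .str accessor
column_has_str = hasattr(df["status"], "str")      # the whole column does

# On a plain str, ordinary slicing gives the first character
first_letter = row["status"][0:1]

print(row_type, cell_type, cell_has_str, column_has_str, first_letter)
```

Printing type(...) of a value just before the failing line is a cheap way to check whether it is a Series, a str, or something else.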
I've tried alternative syntax to no avail. Others who report this error seem to have different situations and resolutions. As a new Python programmer, what are the best first steps/tools/functions toward understanding why this syntax/logic won't work in this situation?
Edit: clarification: the error is related to "val = row['status'].str.slice(0,1)".
Edit2: worth noting, when I opened the data viewer it had something like []...[]...[] instead of a single character value for many observations in the new 'status_update' field, so I'm guessing that some kind of array or vector is being returned instead of a single substring.
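For reference, here is a minimal sketch of two ways the three rules could be written, assuming every column really holds strings and using hypothetical sample data. The first keeps the row-wise function but slices the value directly, since row['status'] inside apply(..., axis=1) is a plain str rather than a Series. The second applies the same rules column-wise with boolean masks, where .str.slice is valid because it operates on a whole column. The names high and result are illustrative only.

```python
import pandas as pd

# Hypothetical sample data standing in for the pd.read_sql() result
df = pd.DataFrame({
    "priority": ["1", "2", "3"],
    "failed": ["Y", "N", "N"],
    "status": ["Open", "Closed", "Pending"],
})

def f(row):
    if row["priority"] in ("1", "2") and row["failed"] == "Y":
        return "F"
    elif row["priority"] in ("1", "2"):
        # row["status"] is a plain str here, so use ordinary slicing
        return row["status"][0:1]
    return "X"

# Row-wise version: one call to f per row
df["status_update"] = df.apply(f, axis=1)

# Column-wise version: start with the default, then overwrite by mask
high = df["priority"].isin(["1", "2"])
result = pd.Series("X", index=df.index)
result.loc[high] = df.loc[high, "status"].str.slice(0, 1)
result.loc[high & (df["failed"] == "Y")] = "F"  # applied last so it takes precedence

print(df["status_update"].tolist())
print(result.tolist())
```

If some values of status can be missing (None or NaN), the row-wise slice would fail on them, so that case would need its own branch.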