I'm trying to convert a string representation of a list of lists into an actual list of lists using ast.literal_eval. I've looked at the following questions on this site:
- Malformed String ValueError ast.literal_eval() with String representation of Tuple
- python ast.literal_eval throwing malformed string error given “datetime.datetime.now()”
but the answers there don't seem to apply to my situation.
I currently have a Pandas DataFrame of the form (example):
industry  index  entities
cars      0      [ ['car1', 'it'], ['them', 'car2', 'car3'] ]
cars      1      [ ['car4', 'its'], ['car5', 'car6'] ]
When I load the CSV file using pandas.read_csv, the entries in the entities column are string representations of lists. I attempted to use ast.literal_eval to convert them into lists, but the following happens:
df['entities'] = ast.literal_eval(df['entities'])
ValueError: malformed node or string: 0 [['car1', 'it'], ['them', 'car2', 'car3']]
1 [['car4', 'its'], ['car5', 'car6']]
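A minimal sketch that reproduces the error with the example rows above. The DataFrame is built inline here purely for illustration, as a stand-in for the contents of my CSV:

```python
import ast
import pandas as pd

# Illustrative stand-in for the CSV contents: each cell is a string, not a list.
df = pd.DataFrame({
    "industry": ["cars", "cars"],
    "entities": [
        "[['car1', 'it'], ['them', 'car2', 'car3']]",
        "[['car4', 'its'], ['car5', 'car6']]",
    ],
})

try:
    # The whole column (a pandas Series) is passed in a single call.
    ast.literal_eval(df["entities"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```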
I'm aware that the argument passed to ast.literal_eval must be a Python literal structure, but everything I'm passing appears to be a valid Python literal, so that doesn't seem to be the problem.
For background: I previously used this same method for an identical operation and it worked fine. Since then, I have modified the original DataFrame to remove instances of the word "the".
What might be causing this error? Any tips would be appreciated. Thank you.
Edit
df.head(2).to_dict() returns the following. Note that this differs from the example I provided because it comes from the original DataFrame that I'm working with:
{'industry': {0: 'automotiveEngineering', 1: 'automotiveEngineering'},
'index': {0: 0, 1: 1},
'entities': {0: "[['Norway', 'it'], ['EQC—and', 'it', 'EQC', 'EQC'], ['Mercedes-Benz EQC Edition 1886 electric SUV', 'it', 'it', 'EQC400 4Matic crossover']]",
1: '[[\'Ford Fusion\', \'Fusion\', \'Fusion\', \'Fusion\'], ["2013–2016 Ford Fusion sedans.automaker \'s", \'automaker\'], [\'Ford\', \'Ford\'], [\'faulty shifter cables that can cause rollaways\', \'these shifter cables , which can break off transmission due to a bad bushing at connection point\'], [\'these bushings\', \'them\']]'}}
I've also tried looping through each row and modifying each entity separately, but it still gives me the same error.
I'd also like to add that when I run ast.literal_eval on a single row, it returns the appropriate value without any problem.
Edit 2
I managed to achieve what I was trying to do by running:
df['column'] = df['column'].apply(ast.literal_eval)
but unfortunately that doesn't answer my initial question of what may be causing the malformed string/node error.
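For completeness, a sketch of that working call on the example rows, again with an inline DataFrame standing in for my CSV. As far as I can tell, Series.apply calls ast.literal_eval once per cell, so each call receives a single string rather than the whole column:

```python
import ast
import pandas as pd

# Illustrative stand-in for the CSV contents: each cell is a string.
df = pd.DataFrame({
    "entities": [
        "[['car1', 'it'], ['them', 'car2', 'car3']]",
        "[['car4', 'its'], ['car5', 'car6']]",
    ],
})

# apply() passes each cell (one string at a time) to ast.literal_eval.
df["entities"] = df["entities"].apply(ast.literal_eval)

first = df["entities"].iloc[0]
print(type(first), first)
```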