I am trying to extract all my scrobbles from the LastFM API ('method': 'user.getrecenttracks'), using Python.
I have been able to extract the raw data, but I am struggling to process it in DataFrames. Most fields come back wrapped in extra ID tags, which I need to strip. Example:
BEFORE: More Than Ever People - Late Night Mix by {'mbid': '', '#text': 'Levitation'} from album: {'mbid': '3240770e-8cbd-49c3-a070-dc92b4ffb8fe', '#text': 'Essential Levitation - 20 years of Ibiza Chillout Music'} {'uts': '1590297990', '#text': '30 May 2020, 10:10'}
AFTER: More Than Ever People - Late Night Mix by Levitation from album: Essential Levitation - 20 years of Ibiza Chillout Music
The data is structured as follows:
    Data columns (total 4 columns):
     #   Column  Non-Null Count  Dtype
    ---  ------  --------------  -----
     0   artist  501 non-null    object
     1   album   501 non-null    object
     2   name    501 non-null    object
     3   date    500 non-null    object
Stripping works for every field except 'date'. I index each value as row['index1']['index2'], and according to the LastFM API all of these fields are structured more or less the same way.
So row['album']['#text'] works fine, whereas row['date'] = row['date']['#text'] errors out with "TypeError: string indices must be integers".
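The message suggests that at least one value in the 'date' column is a plain string rather than a dict, and the column summary above also shows one missing date (500 non-null out of 501). A minimal check of which Python types a column holds could look like the sketch below. The `sample` frame and its three rows are made up for illustration and are not my real data:

```python
import pandas as pd

# Hypothetical sample: one dict-valued date, one plain string, one missing value.
sample = pd.DataFrame({
    "name": ["Track A", "Track B", "Track C"],
    "date": [
        {"uts": "1590297990", "#text": "30 May 2020, 10:10"},
        "30 May 2020, 10:10",
        None,
    ],
})

# Count how many cells of each Python type the column contains.
type_counts = sample["date"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

Anything other than `dict` in that count would explain why `['#text']` indexing fails on some rows.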
See the code below (the commented-out lines are the part I am struggling with):
    for index, row in df_track_list.iterrows():
        print("pre:", row['name'], "by", row['artist'], "from album:", row['album'], row['date'])
        row['artist'] = row['artist']['#text']
        row['album'] = row['album']['#text']
        #row['date'] = row['date']['#text']
        #print(row['date']['#text'])
        print("post:", row['name'], "by", row['artist'], "from album:", row['album'])
What is happening here? Any ideas? Or anybody with working examples?
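For reference, here is a minimal sketch of the result I am after, assuming every cell is either a dict with a '#text' key, a plain string, or missing. The helper name `text_of` and the two-row `sample` frame are made up for illustration; this is not the code I am currently running:

```python
import pandas as pd

def text_of(value):
    """Return value['#text'] for dicts; pass anything else through unchanged."""
    if isinstance(value, dict):
        return value.get("#text")
    return value

# Hypothetical two-row frame shaped like my data; the second row has no date.
sample = pd.DataFrame({
    "artist": [{"mbid": "", "#text": "Levitation"},
               {"mbid": "", "#text": "Levitation"}],
    "album": [{"mbid": "", "#text": "Some Album"},
              {"mbid": "", "#text": "Some Album"}],
    "name": ["More Than Ever People - Late Night Mix", "Another Track"],
    "date": [{"uts": "1590297990", "#text": "30 May 2020, 10:10"}, None],
})

# Strip the wrapper column by column instead of mutating rows inside iterrows().
for col in ["artist", "album", "date"]:
    sample[col] = sample[col].map(text_of)

print(sample)
```

Working on whole columns also writes the result back to the DataFrame, which assigning to `row` inside `iterrows()` is not guaranteed to do.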