I'm writing a Python script that iterates through new-music editorial playlists with the Spotify API and pulls track, artist, and album information into a CSV file. The script worked well for a while, processing every track on the playlists in my list of IDs, but it stopped while processing a track after about an hour of run time.

I thought this might be caused by my access token expiring, so I added some code near the beginning of the script to get the cached token info and refresh it on each new run. My thinking was that this would buy at least another hour of run time, so I could dig deeper and see if and where I need to add an automatic refresh in case the token expires while the data pull is iterating.

For whatever reason, the script isn't printing a bad request or token expired error to the console. It simply gets stuck while processing the first track on the first playlist, as you can see in the screenshot below. For context, while it was working the console printed every track in the same format from all playlist IDs in my list, and then it got stuck in the middle of a single playlist ID. Now it gets stuck at the very first track on the first playlist.

I am almost certain this is some sort of issue with my access token. My question is: why is it getting stuck without throwing an error, and how can I fix it so the token refreshes automatically and the script keeps running instead of stopping early? Thanks!

Current output, stuck processing the first track as described above. Before, it ran through many tracks and iterated properly through the playlists until it got stuck in the same way; now it only processes one track.


import csv
from datetime import datetime, timedelta
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import time

# Set up credentials and authorization parameters
client_id = 'myclientid'
client_secret = 'myclientsecret'
redirect_uri = 'https://google.com/'
scope = 'playlist-modify-public playlist-modify-private'
username = 'myusername'

# Create Spotipy object using SpotifyOAuth
sp = spotipy.Spotify(
    auth_manager=SpotifyOAuth(
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri=redirect_uri,
        scope=scope,
        username=username
    )
)

# Refresh access token **This is what I added after realizing program was getting stuck after about an hour of run-time**

token_info = sp.auth_manager.get_cached_token()
sp.auth_manager.refresh_access_token(token_info['refresh_token'])
print("Refreshing Access Token.")
new_token_info = sp.auth_manager.get_cached_token()
print("Old access token:", token_info['access_token'])
print("New access token:", new_token_info['access_token'])

# Define a list of playlist IDs
playlist_ids = ['37i9dQZF1DX4JAvHpjipBk', '37i9dQZF1DX0XUsuxWHRQd', '37i9dQZF1DXdwmD5Q7Gxah', '37i9dQZF1DXcBWIGoYBM5M', '37i9dQZF1DX10zKzsJ2jva', '37i9dQZF1DWY7IeIP1cdjF', '37i9dQZF1DX76Wlfdnj7AP', '37i9dQZF1DX0FOF1IUWK1W', '37i9dQZF1DX1lVhptIYRda', '37i9dQZF1DXdSjVZQzv2tl', '37i9dQZF1DX4sWSpwq3LiO', '37i9dQZF1DWY4xHQp97fN6', '37i9dQZF1DWZjqjZMudx9T', '37i9dQZF1DX4SBhb3fqCJd', '37i9dQZF1DX4dyzvuaRJ0n', '37i9dQZF1DWTkIwO2HDifB', '37i9dQZF1DWWQRwui0ExPn', '37i9dQZF1DXaXB8fQg7xif', '37i9dQZF1DX5BAPG29mHS8', '37i9dQZF1DWZd79rJ6a7lp', '37i9dQZF1DXcZQSjptOQtk', '37i9dQZF1DXcF6B6QPhFDv', '37i9dQZF1DX9tPFwDMOaN1', '37i9dQZF1DWWY64wDtewQt', '37i9dQZF1DX0BcQWzuB7ZO', '37i9dQZF1DXcZDD7cfEKhW', '37i9dQZF1DWYBO1MoTDhZI', '37i9dQZF1DXbbu94YBG7Ye', '37i9dQZF1DXb0COFso7q0D', '37i9dQZF1DWY4lFlS4Pnso', '37i9dQZF1DWUa8ZRTfalHk', '37i9dQZF1DXaxEKcoCdWHD', '37i9dQZF1DWSpF87bP6JSF', '37i9dQZF1DX6GwdWRQMQpq']

tracksData = []

# Iterate through playlist IDs and extract track information
for playlist_id in playlist_ids:
    # Use Spotipy API to get playlist data
    playlist = sp.playlist(playlist_id)

    # Use Spotipy API to get track data
    results = sp.playlist_tracks(playlist_id)

    count = 1

    # Extract track information and add to tracksData array
    for track in results['items']:
        track = track['track']
        print(f"Processing track: {track['artists'][0]['name']} - {track['name']} from playlist: {playlist['name']}")
        start_time = time.time()
        try:
            sp.artist(track['artists'][0]['id'])
            sp.track(track['id'])
            sp.album(track['album']['id'])
        except:
            pass
        elapsed_time = time.time() - start_time
        if elapsed_time > 3:
            print(f"Skipping track: {track['artists'][0]['name']} - {track['name']} from playlist: {playlist['name']} (took too long to process)")
            continue
        tracksData.append({
            'artistName': track['artists'][0]['name'],
            'songName': track['name'],
            'releaseDate': track['album']['release_date'],
            'positionInPlaylist': count,
            'artistFollowers': sp.artist(track['artists'][0]['id'])['followers']['total'],
            'albumImageUrl': track['album']['images'][0]['url'],
            'trackPopularity': track['popularity'],
            'artistPopularity': sp.artist(track['artists'][0]['id'])['popularity'],
            'isrc': track['external_ids']['isrc'],
            'albumLabel': sp.album(track["album"]["id"])["label"],
            'albumExternalUrl': track['album']['external_urls']['spotify'],
            'playlistId': playlist_id,
            'playlistName': playlist['name'], # Set playlistName to actual name of playlist
            'playlistImage': playlist['images'][0]['url'], # Add playlist image to dictionary
            'playlistFollowers': playlist['followers']['total'], # Add playlist followers to dictionary
            'trackId': track['id'], # Add track ID to dictionary
            'albumId': track['album']['id'] # Add album ID to dictionary
        })
        count += 1

    time.sleep(2) # Pause for 2 seconds before processing the next playlist

# Calculate the most recent Friday
today = datetime.today()
friday = today - timedelta((today.weekday() - 4) % 7)

# Calculate the date 7 days prior to the most recent Friday
lastWeekFriday = friday - timedelta(days=7)

# Create a list of track dictionaries with release dates within the past week
recentTracks = []

for track in tracksData:
    # Convert release date string to datetime object
    releaseDate = datetime.strptime(track['releaseDate'], '%Y-%m-%d')

    # Check if release date is within the past week
    if lastWeekFriday <= releaseDate < friday:
        recentTracks.append(track)

# Create and write track data to CSV file
with open('tracksData.csv', mode='w', newline='') as csv_file:
    fieldnames = ['artistName', 'songName', 'releaseDate', 'positionInPlaylist', 'artistFollowers', 'albumImageUrl',
                  'trackPopularity', 'artistPopularity', 'isrc', 'albumLabel', 'albumExternalUrl', 'playlistId',
                  'playlistName', 'playlistImage', 'playlistFollowers', 'trackId', 'albumId']
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)

    writer.writeheader()
    for track in recentTracks:
        writer.writerow(track)

Edit: I just tried switching the auth method to the client credentials flow, since my code doesn't really use any user scopes, to see if it would make a difference. The new auth method is incorporated into the second block of code below. It has not changed anything; the script is still stuck. In this new attempt I also tried adding a refresh before iterating through the playlists and a try/except during the iterations, but it still gets stuck processing the first track.

from spotipy.oauth2 import SpotifyClientCredentials

client_id = 'client_id'
client_secret = 'client_secret'

auth_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
sp = spotipy.Spotify(auth_manager=auth_manager)

...

def refresh_access_token():
    sp.auth_manager.get_access_token(as_dict=False, check_cache=False)

refresh_access_token()

# Iterate through playlist IDs and extract track information
for playlist_id in playlist_ids:
    # Use Spotipy API to get playlist data
    playlist = sp.playlist(playlist_id)

    # Use Spotipy API to get track data
    results = sp.playlist_tracks(playlist_id)

    count = 1

    # Extract track information and add to tracksData array
    for track in results['items']:
        track = track['track']
        print(f"Processing track: {track['artists'][0]['name']} - {track['name']} from playlist: {playlist['name']}")
        start_time = time.time()
        while True:
            try:
                tracksData.append({
                    'artistName': track['artists'][0]['name'],
                    'songName': track['name'],
                    'releaseDate': track['album']['release_date'],
                    'positionInPlaylist': count,
                    'artistFollowers': sp.artist(track['artists'][0]['id'])['followers']['total'],
                    'albumImageUrl': track['album']['images'][0]['url'],
                    'trackPopularity': track['popularity'],
                    'artistPopularity': sp.artist(track['artists'][0]['id'])['popularity'],
                    'isrc': track['external_ids']['isrc'],
                    'albumLabel': sp.album(track["album"]["id"])["label"],
                    'albumExternalUrl': track['album']['external_urls']['spotify'],
                    'playlistId': playlist_id,
                    'playlistName': playlist['name'], # Set playlistName to actual name of playlist
                    'playlistImage': playlist['images'][0]['url'], # Add playlist image to dictionary
                    'playlistFollowers': playlist['followers']['total'], # Add playlist followers to dictionary
                    'trackId': track['id'], # Add track ID to dictionary
                    'albumId': track['album']['id'] # Add album ID to dictionary
                })
                count += 1
                break
            except spotipy.exceptions.SpotifyException:
                refresh_access_token()
            except Exception as e:
                print(e)
    time.sleep(2) # Pause for 2 seconds before processing the next playlist
yaboy618

1 Answer

Spotipy does not provide automatic access-token renewal, but you can request a new access token with its built-in functions.

is_token_expired() checks whether an access token has expired.

refresh_access_token() obtains a new access token, taking the refresh token as its input parameter.

Your code does not monitor or handle the one-hour expiry.

This code gives a hint on how to address your problem.

from datetime import datetime
import spotipy
from spotipy.oauth2 import SpotifyOAuth
import json

# Set up credentials and authorization parameters
client_id = '<your client id>'
client_secret = '<your client secret>'
redirect_uri = '<your redirect URL>'
scope = 'playlist-modify-public playlist-modify-private'
username = '<your user id>'

# Create Spotipy object using SpotifyOAuth
sp = spotipy.Spotify(
    auth_manager=SpotifyOAuth(
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri=redirect_uri,
        scope=scope,
        username=username
    )
)

# get token information
token_info = sp.auth_manager.get_cached_token()
refresh_token = token_info['refresh_token']

print('first access token')
print(json.dumps(token_info, indent=2))
print(datetime.fromtimestamp(int(token_info['expires_at'])))
print(sp.auth_manager.is_token_expired(token_info))

print('-------------------------------------------------')

# get new access token by refresh token
sp.auth_manager.refresh_access_token(refresh_token)
token_info = sp.auth_manager.get_cached_token()

print('Second access token')
print(json.dumps(token_info, indent=2))
print(datetime.fromtimestamp(int(token_info['expires_at'])))
print(sp.auth_manager.is_token_expired(token_info))

I tested this demo code, which saves a JSON file for each playlist.

The problem was the Spotify API rate limit, not a token issue. I added an API call rate calculation. Spotify allows approximately 180 requests per minute, yet I measured 234.5 calls per minute without a problem. I think my calls may have stayed within the allowed number, or the limit may apply to the total number of calls. I don't know exactly what the limit on the total number of calls is. Here is information

import spotipy
from spotipy.oauth2 import SpotifyOAuth
import time
from datetime import datetime
import json
import asyncio
import os

# Set up credentials and authorization parameters
client_id = '<your client id>'
client_secret = '<your client secret>'
redirect_uri = '<your redirect uri>'
scope = 'playlist-modify-public playlist-modify-private'
username = '<your user id>'
data_directory = 'tracks'

# Create Spotipy object using SpotifyOAuth
sp = spotipy.Spotify(
    auth_manager=SpotifyOAuth(
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri=redirect_uri,
        scope=scope,
        username=username
    )
)

token_info = sp.auth_manager.get_access_token()
refresh_token = token_info['refresh_token']

playlist_ids = ['37i9dQZF1DX4JAvHpjipBk', '37i9dQZF1DX0XUsuxWHRQd', '37i9dQZF1DXdwmD5Q7Gxah', '37i9dQZF1DXcBWIGoYBM5M', '37i9dQZF1DX10zKzsJ2jva', '37i9dQZF1DWY7IeIP1cdjF', '37i9dQZF1DX76Wlfdnj7AP', '37i9dQZF1DX0FOF1IUWK1W', '37i9dQZF1DX1lVhptIYRda', '37i9dQZF1DXdSjVZQzv2tl', '37i9dQZF1DX4sWSpwq3LiO', '37i9dQZF1DWY4xHQp97fN6', '37i9dQZF1DWZjqjZMudx9T', '37i9dQZF1DX4SBhb3fqCJd', '37i9dQZF1DX4dyzvuaRJ0n', '37i9dQZF1DWTkIwO2HDifB', '37i9dQZF1DWWQRwui0ExPn', '37i9dQZF1DXaXB8fQg7xif', '37i9dQZF1DX5BAPG29mHS8', '37i9dQZF1DWZd79rJ6a7lp', '37i9dQZF1DXcZQSjptOQtk', '37i9dQZF1DXcF6B6QPhFDv', '37i9dQZF1DX9tPFwDMOaN1', '37i9dQZF1DWWY64wDtewQt', '37i9dQZF1DX0BcQWzuB7ZO', '37i9dQZF1DXcZDD7cfEKhW', '37i9dQZF1DWYBO1MoTDhZI', '37i9dQZF1DXbbu94YBG7Ye', '37i9dQZF1DXb0COFso7q0D', '37i9dQZF1DWY4lFlS4Pnso', '37i9dQZF1DWUa8ZRTfalHk', '37i9dQZF1DXaxEKcoCdWHD', '37i9dQZF1DWSpF87bP6JSF', '37i9dQZF1DX6GwdWRQMQpq']

now = datetime.now()
if not os.path.exists('./{}'.format(data_directory)):
   os.makedirs('./{}'.format(data_directory))

count = 1
try:
    for playlist_id in playlist_ids:
        # Use Spotipy API to get playlist data
        playlist = sp.playlist(playlist_id)

        # Use Spotipy API to get track data
        results = sp.playlist_tracks(playlist_id)

        tracksData = []
        print(f"Started Playlist: {playlist['id']}")
        # Extract track information and add to tracksData array
        for track in results['items']:
            track = track['track']
            print(f"Processing track: {track['artists'][0]['name']} - {track['name']} from playlist - count:{count}: {playlist['name']}")
            start_time = time.time()
            tracksData.append({
                'artistName': track['artists'][0]['name'],
                'songName': track['name'],
                'releaseDate': track['album']['release_date'],
                'positionInPlaylist': count,
                'artistFollowers': sp.artist(track['artists'][0]['id'])['followers']['total'],
                'albumImageUrl': track['album']['images'][0]['url'],
                'trackPopularity': track['popularity'],
                'artistPopularity': sp.artist(track['artists'][0]['id'])['popularity'],
                'isrc': track['external_ids']['isrc'],
                'albumLabel': sp.album(track["album"]["id"])["label"],
                'albumExternalUrl': track['album']['external_urls']['spotify'],
                'playlistId': playlist_id,
                'playlistName': playlist['name'], # Set playlistName to actual name of playlist
                'playlistImage': playlist['images'][0]['url'], # Add playlist image to dictionary
                'playlistFollowers': playlist['followers']['total'], # Add playlist followers to dictionary
                'trackId': track['id'], # Add track ID to dictionary
                'albumId': track['album']['id'] # Add album ID to dictionary
            })
            count += 1
            # time.sleep blocks for 2 seconds; an un-awaited asyncio.sleep(2) would not pause at all
            time.sleep(2)
            if(sp.auth_manager.is_token_expired(token_info)):
                sp.auth_manager.refresh_access_token(refresh_token)
                token_info = sp.auth_manager.get_cached_token()
                refresh_token = token_info['refresh_token']
        print(f"Finished Playlist: {playlist['id']}")
        json_object = json.dumps(tracksData, indent=4)
        print(json.dumps(tracksData, indent=2))
        file_name = './{}/{}.json'.format(data_directory, playlist['id'])
        with open(file_name, "w") as outfile:
            outfile.write(json_object)

except Exception as error:
    # count is an int, so format it instead of concatenating it to a string
    print(f"Exception occurred for value '{count}': {error!r}")

later = datetime.now()
difference = (later - now).total_seconds()
minutes = difference // 60
if minutes > 0:  # avoid dividing by zero on runs shorter than one minute
    print(f"The number of API calls: {count/minutes}")
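
The fixed pause in the loop above could also be expressed as a small throttle that waits only as long as needed to stay under a chosen request rate. This is a minimal sketch, not part of Spotipy: the class name `MinIntervalThrottle` and the figure of 120 calls per minute are assumptions picked for illustration, since the exact limit is not known here.

```python
import time


class MinIntervalThrottle:
    """Keep successive calls at least `60 / max_calls_per_minute` seconds apart.

    Hypothetical helper for illustration; it is not a Spotipy class.
    """

    def __init__(self, max_calls_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_calls_per_minute
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep just long enough to respect the configured rate, then record the call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# Assumed budget of 120 calls per minute, i.e. one call every 0.5 seconds
throttle = MinIntervalThrottle(max_calls_per_minute=120)
throttle.wait()  # the first call never sleeps
```

In the loop, `throttle.wait()` would go immediately before each request such as `sp.artist(...)` or `sp.album(...)`, so the pacing applies per API call rather than per track.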

Result: a total of 2344 tracks (songs) across 34 playlists were processed without a problem.

Processing track: Spiffy The Goat - No Clappin' Shemix (Throw It) from playlist - count:2342: Feelin' Myself
Processing track: Monaleo - Body Bag from playlist - count:2343: Feelin' Myself
Processing track: Rican Da Menace - I Admit It from playlist - count:2344: Feelin' Myself
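
One way to lower the number of requests, sketched here under assumptions, is to remember lookups that repeat. The loop in this answer requests the same artist twice per track, and artists and albums can recur across playlists. The helper name `cached_lookup` and the stand-in function `fake_fetch_artist` are hypothetical; the stand-in replaces the real network call so the example runs on its own.

```python
from functools import lru_cache


def cached_lookup(fetch):
    """Return a version of `fetch` that remembers its result for each ID."""
    return lru_cache(maxsize=None)(fetch)


# Stand-in for a network call such as sp.artist; it records every real fetch
calls = []


def fake_fetch_artist(artist_id):
    calls.append(artist_id)
    return {"id": artist_id, "followers": {"total": 0}, "popularity": 0}


get_artist = cached_lookup(fake_fetch_artist)

for artist_id in ["a1", "a2", "a1", "a1"]:
    artist = get_artist(artist_id)  # fetched once per distinct ID
    followers = artist["followers"]["total"]
    popularity = artist["popularity"]

print(len(calls))  # 2
```

With a real client this would look like `get_artist = cached_lookup(sp.artist)` and `get_album = cached_lookup(sp.album)`, with both `artistFollowers` and `artistPopularity` read from the single returned artist object.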

This is the last saved json file (part of it)

37i9dQZF1DXdwmD5Q7Gxah.json

[
    {
        "artistName": "d4vd",
        "songName": "WORTHLESS",
        "releaseDate": "2023-03-09",
        "positionInPlaylist": 151,
        "artistFollowers": 800225,
        "albumImageUrl": "https://i.scdn.co/image/ab67616d0000b273c158e7f083a8e87f7a5662a8",
        "trackPopularity": 64,
        "artistPopularity": 84,
        "isrc": "USUM72302840",
        "albumLabel": "Darkroom/Interscope Records",
        "albumExternalUrl": "https://open.spotify.com/album/3hNpYeCH7WOUNhXxV7AosH",
        "playlistId": "37i9dQZF1DXdwmD5Q7Gxah",
        "playlistName": "Lorem",
        "playlistImage": "https://i.scdn.co/image/ab67706f00000003346b60cf6d7b749de180c3ae",
        "playlistFollowers": 1019959,
        "trackId": "13b4mk5KeJxL0GllHLvtXQ",
        "albumId": "3hNpYeCH7WOUNhXxV7AosH"
    }
...

    {
        "artistName": "georgee",
        "songName": "sad",
        "releaseDate": "2022-09-28",
        "positionInPlaylist": 250,
        "artistFollowers": 5386,
        "albumImageUrl": "https://i.scdn.co/image/ab67616d0000b273a1fd9c8268069b6fb5c3c80e",
        "trackPopularity": 47,
        "artistPopularity": 38,
        "isrc": "QZRYT2100043",
        "albumLabel": "Good Boy Records",
        "albumExternalUrl": "https://open.spotify.com/album/6XcchJ2jRgI28zFKMUulO9",
        "playlistId": "37i9dQZF1DXdwmD5Q7Gxah",
        "playlistName": "Lorem",
        "playlistImage": "https://i.scdn.co/image/ab67706f00000003346b60cf6d7b749de180c3ae",
        "playlistFollowers": 1019959,
        "trackId": "5kIfQKgQeFcLaQ3BYvpbDI",
        "albumId": "6XcchJ2jRgI28zFKMUulO9"
    }
]

Bench Vue
  • thanks this looks like it will help for sure, it perfectly iterates through the playlists I am just wondering if the if statement further down should be somewhere else, will this refresh the access token during individual playlist track iteration or only once a playlist id is finished going to the next? i'm asking because sometimes it will get stuck midway through a playlist id. thanks again! – yaboy618 Mar 12 '23 at 21:55
  • The main idea is that on every iteration of the loop over results['items'], the code checks whether the token has expired. If it has not expired (within one hour), the job keeps going and appends the track data. If the token has expired, it gets a new access token using the previous refresh token and updates the stored refresh token (in this case the refresh token stays the same), all within the same playlist. When switching playlists, the same expiry check runs. So the check does not depend on whether the playlist switches. It will work. – Bench Vue Mar 12 '23 at 22:26
  • my program got stuck processing again after 1 hour even with the new code. really strange. – yaboy618 Mar 13 '23 at 00:29
  • Can you update your playlist_ids separately? I will test with it. – Bench Vue Mar 13 '23 at 00:33
  • yes, just updated. i am wondering if its a rate limit or if it is the refresh. i have another program using the spotify api that pulls in much more data though and never get rate limited. im wondering if maybe we can throughout the code do try/except arguments in various places with refresh() as a function for the except and then have duplicate nested try/except within parent try/except with continue as except in nested one. do you think this would work? – yaboy618 Mar 13 '23 at 00:38
  • Thanks, I got the full list of your playlists. I will try it. Give me some time. – Bench Vue Mar 13 '23 at 00:41
  • ok thank you so much for all of the help let me know what its looking like for you. for me it got stuck processing a track but it doesnt throw errors or end the script just doesnt proceed at a certain point. – yaboy618 Mar 13 '23 at 01:11
  • @yaboy618, I updated my answer. The problem was not a refresh issue. I got a hanging situation (no progress, no error) when there was no sleep() command. I tried calling the API from Postman to get the playlist. It was my first experience of this, but after I added a sleep(2) command there was no error and I could get the result. Can you try it yourself and let me know your result? – Bench Vue Mar 13 '23 at 10:42
  • new answer is great. i'm still getting rate limited after about an hour. is there any way to reduce number of requests per minute? – yaboy618 Mar 13 '23 at 18:04
  • i think my only option is to request a quota extension. thanks so much for the hard work i really appreciate it. – yaboy618 Mar 13 '23 at 18:08
  • No problem. Don't forget to vote for me. It will give me 25 reputation points. – Bench Vue Mar 13 '23 at 18:18
  • If you are still rate limited, extend the pause to sleep(3). I got 234.5 API calls/min with sleep(2). The limit is 180 calls/min. You can test it. – Bench Vue Mar 13 '23 at 18:23
  • thanks i will test it out also! i just upvoted and checked your answer, if theres anything else i can do for you let me know! – yaboy618 Mar 13 '23 at 18:42
  • OK, thanks. Nothing else; just confirm to me the resulting API call rate if you run with asyncio.sleep(3). – Bench Vue Mar 13 '23 at 18:44