0

Right now I have the following, very slow but working code:

crossover_list = {}
    for song_id in song_ids:
        crossover_set = list(dance_occurrences.filter(
            song_id=song_id).values_list('dance_name_id', flat=True).distinct())
        crossover_list[song_id] = crossover_set

It returns a dictionary where a song ID is used as a dictionary key, and a list of integer values is used as the value. The first three keys are the following:

crossover_list = {
        1:[38,37],
        2:[38],
        ....
}

Does anyone here know of a succinct way to wrap this up into a single query? The data exists in a single table that has three columns where each song_id can be associated with multiple dance_ids.

song_id | playlist_id | dance_id
      1             1         38
      1             2         37
      2             1         38

Ideally, what I am trying to figure out how to return is:

<QuerySet[{'song_id':1, [{'dance_id':38, 'dance_id':37}]}, {'song_id':2, [{'dance_id':38}]}]>

Any ideas or help is appreciated.

Edit: As requested, here are the models in question:

# This model helps us do analysis on music/dance crossover density, song popularity within-genre,
# playlist viability within-genre (based on song occurrence counts per song within each playlist), etc.
class SongOccurrences(models.Model):
    song = models.ForeignKey(
        'Songs',
        on_delete=models.CASCADE,
    )
    playlist = models.ForeignKey(
        'Playlists',
        on_delete=models.CASCADE,
    )

    dance_name = models.ForeignKey(
        'DanceMasterTable',
        on_delete=models.CASCADE,
    )

    class Meta:
        constraints = [
            models.UniqueConstraint(fields=['song', 'playlist'], name='unique occurrence')
        ]

# This model contains relevant data from Spotify about each playlist
class Playlists(models.Model):
    spotify_playlist_uri = models.CharField(max_length=200, unique=True)
    spotify_playlist_owner = models.CharField(max_length=200)
    spotify_playlist_name = models.CharField(max_length=200)
    current_song_count = models.IntegerField()
    previous_song_count = models.IntegerField()
    dance_name = models.ForeignKey(
        'DanceMasterTable',
        on_delete=models.CASCADE,
    )

# This model contains all relavent data from Spotify about each song
class Songs(models.Model):
    title = models.CharField(max_length=200)
    first_artist = models.CharField(max_length=200)
    all_artists = models.CharField(max_length=200)
    album = models.CharField(max_length=200)
    release_date = models.DateField('Release Date', blank=True)
    genres = models.CharField(max_length=1000, blank=True)
    popularity = models.FloatField(blank=True)  # This value changes often
    explicit = models.BooleanField(blank=True)
    uri = models.CharField(max_length=200, unique=True)
    tempo = models.FloatField(blank=True)
    time_signature = models.IntegerField()
    energy = models.FloatField(blank=True)
    danceability = models.FloatField(blank=True)
    duration_ms = models.IntegerField()
    tonic = models.IntegerField(blank=True)
    mode = models.IntegerField(blank=True)
    acousticness = models.FloatField(blank=True)
    instrumentalness = models.FloatField(blank=True)
    liveness = models.FloatField(blank=True)
    loudness = models.FloatField(blank=True)
    speechiness = models.FloatField(blank=True)
    valence = models.FloatField(blank=True)

    class Meta:
        constraints = [
            models.UniqueConstraint(fields=['title', 'first_artist', 'all_artists'], name='unique song')
        ]

# This model contains the (static) master list of partner dances to be analyzed.
class DanceMasterTable(models.Model):
    dance_name = models.CharField(max_length=200, unique=True)
Virgil
  • 3
  • 5

1 Answers1

0

You are running queries inside a loop and hence, it's slow.

You can filter dance_occurrences by all the song_ids before hand and finally loop over the values to append dance ids to their respective song ids.

Example:

song_dance_occurrences = dance_occurrences.filter(
    song_id__in=song_ids
).values_list('song_id', 'dance_id').distinct()

crossover_dict = {}
for song_id, dance_id in song_dance_occurrences:
    crossover_dict[song_id] = crossover_dict.get(song_id, [])
    crossover_dict[song_id].append(dance_id)
Sachin
  • 3,576
  • 1
  • 15
  • 24
  • This helped! I tried to up-vote your reply, but I don't have a enough reputation points for Stackoverflow to register it. The only thing I had to add was .distinct() to the song_dance_occurrences query to get it to give the exact output I was looking for. Thank you! – Virgil Dec 06 '19 at 22:19
  • Updated the answer. You can accept the answer, if it worked. – Sachin Dec 07 '19 at 05:48