I’m trying to write a function to get album data from Spotify’s API for a data frame of albums and artists. Because there are some misspellings in the dataset, I need to use a fuzzy matching function (like agrepl).
However, some artists, like Absu, have albums that are, by agrepl's standards, the same. For example, Absu has an album named “Absu” and another named “Abzu”. I only want the data for one of them, but I end up with data for both. I know that you can change max.distance in agrepl, but I need it set fairly low to account for greater misspellings.
Is there a pre-built function or an easy way to tell R:
if there is an exact match of album_name with mydata[["Album"]]: filter on it and move on
else: try to find a close match to filter on?
Here’s something I’ve tried, but it doesn’t work:
get_album_data <- function(x) {
  get_artist_audio_features(mydata$Artist[x], return_closest_artist = TRUE) %>%
    ifelse(album_name %in% mydata$Album[x],
           filter(mydata$Album[x] == album_name,
           filter(agrepl(mydata$Album[x], album_name, ignore.case = TRUE))))
}
This is what my code looks like without trying anything special:
library(dplyr)
library(spotifyr)
library(purrr)

# Credentials from Spotify's developer page
Sys.setenv(SPOTIFY_CLIENT_ID = "xxx")
Sys.setenv(SPOTIFY_CLIENT_SECRET = "xxx")
access_token <- get_spotify_access_token()

Artist <- c("Spiritualized", "Fleet Foxes", "The Avalanches", "Absu")
Album <- c("Sweet Heart, Sweet Light", "Helplessness Blues", "Wildflower", "Abzu")
mydata <- tibble(Artist, Album)

# Fetch audio features for the artist in row x and keep only the fuzzy-matched album
get_album_data <- function(x) {
  get_artist_audio_features(mydata[["Artist"]][x], return_closest_artist = TRUE) %>%
    filter(agrepl(mydata[["Album"]][x], album_name, ignore.case = TRUE)) %>%
    mutate(Artist = mydata[["Artist"]][x])
}
Any ideas? Thanks
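To make the logic I’m after concrete, here is a minimal base R sketch of "exact match first, fuzzy match only as a fallback". The helper name match_album and the toy features data frame are made up for illustration; features just stands in for whatever get_artist_audio_features() returns, and I’m only assuming it has an album_name column.

```r
# Hypothetical helper: prefer an exact album match, fall back to agrepl().
match_album <- function(features, target, max_distance = 0.1) {
  # Case-insensitive exact comparison against every album name
  exact <- tolower(features$album_name) == tolower(target)
  if (any(exact)) {
    return(features[exact, , drop = FALSE])
  }
  # No exact match: fall back to approximate matching
  fuzzy <- agrepl(target, features$album_name,
                  max.distance = max_distance, ignore.case = TRUE)
  features[fuzzy, , drop = FALSE]
}

# Toy data standing in for the Spotify response (not real API output)
features <- data.frame(
  album_name = c("Absu", "Abzu", "Tara"),
  track_name = c("track a", "track b", "track c"),
  stringsAsFactors = FALSE
)

exact_hit <- match_album(features, "Abzu")   # exact match exists, so agrepl() is skipped
fuzzy_hit <- match_album(features, "Abzzu")  # misspelled, so agrepl() is used
```

Because the if branch returns early, agrepl() only runs when no album name matches exactly, so “Absu” would not be pulled in alongside “Abzu”. Inside the pipeline this could presumably be called on the result of get_artist_audio_features() in place of the filter() step.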