
I am looping over calls to an API to match strings and standardize data against my own reference data set. In most cases the API returns a response and the results are written to an output file. However, when the API returns NULL, the loop stops, and I have to remove the offending string before it will run again. This makes the process very repetitive. Is there any way to:

  1. Find the strings for which the API will return NULL? Such strings can then be fixed in our data.
  2. Write NULL or NA to the output file for the strings that return NULL?

I cannot share the API as it has been developed internally in the organization but will share the code.

library(httr)      # GET(), content()
library(jsonlite)  # fromJSON()
library(dplyr)     # select(), contains(), distinct(), %>%

# Remove a string that is known to stop the loop
DESTINATIONS <- subset(DESTINATIONS, DESTINATIONS != "ABCDEF")

df <- data.frame()

for (i in seq_len(nrow(DESTINATIONS))) {
  # Build the request URL for the i-th destination string
  location_url <- paste0(base_url, "destinations?name=", DESTINATIONS$DESTINATIONS[i], specs)

  destination_res <- GET(location_url)
  destination_text <- content(destination_res, "text", encoding = "UTF-8")

  # Parse the JSON body and flatten it into a one-row data frame
  location_df1 <- fromJSON(destination_text, flatten = TRUE)
  location_df1 <- do.call(c, unlist(location_df1, recursive = FALSE))
  location_df1 <- as.data.frame(t(location_df1))

  # Distinct country names, one per row
  coordinates_a <- select(location_df1, contains("items.country.name"))
  coordinates_a <- coordinates_a %>% distinct() %>% t()
  coordinates_a <- as.data.frame(coordinates_a)

  # Distinct item ids, one per row
  coordinates_b <- select(location_df1, contains("items.id"))
  coordinates_b <- coordinates_b %>% distinct() %>% t()
  coordinates_b <- as.data.frame(coordinates_b)

  # Combine both columns and append them to the running result
  coordinates <- cbind.data.frame(coordinates_a, coordinates_b)
  df <- rbind.data.frame(df, coordinates)
}

In short, if a string from the DESTINATIONS data frame gets no response from the API, the loop breaks.

Thank you for all the help in advance.

marine8115
  • When you say "breaks", do you mean "throws an error"? If so, `tryCatch()` will probably be your solution. – Limey Jul 22 '20 at 07:49
  • By “breaks” I mean it processes all the strings before that one and stores them in the output file. Then I remove this string and start processing strings after that. – marine8115 Jul 22 '20 at 07:51
  • That's not what I asked. How do you know the code has "broken"? Do you get an error or warning message? You don't check the status of the response to your `GET`. If your API is well-behaved, that should help you. Etc, etc. If you want us to help you, you need to provide information that is complete and precise. What we have now is neither. – Limey Jul 22 '20 at 07:54
  • ```Error in `$<-.data.frame`(`*tmp*`, "DEST", value = "ABCDEF") : replacement has 1 row, data has 0``` This is the error I am getting – marine8115 Jul 22 '20 at 07:56
  • So now you have one possible solution. Find the line that throws the error (use multiple `print` statements if you have to), and test for a data frame with zero rows. Also, you REALLY should test the status code of your response before you do anything else. If you had done so, and your API is well written, that almost certainly would have avoided the problem in the first place. – Limey Jul 22 '20 at 08:00
  • The response is a NULL meaning it does not find anything with exit code 200. This is fine. Plus, we are looking for a solution in R and not something to be fixed in the API. That API is being used for multiple purposes and is extremely well written. I guess this is not a platform for rants, more for sharing knowledge if one has the capability. – marine8115 Jul 22 '20 at 08:16
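
The error quoted in the comments can be reproduced without the API. The sketch below assumes that `coordinates` ends up as a data frame with zero rows when nothing matches; the column names `country` and `id` are placeholders, not taken from the real output. It shows the failing assignment and the zero-row test suggested above.

```r
# A zero-row data frame standing in for `coordinates` when nothing matched
# (assumption: this is what the loop produces for an unmatched string)
coordinates <- data.frame(country = character(0), id = character(0))

# Assigning a length-1 value to a zero-row data frame raises an error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  {
    coordinates$DEST <- "ABCDEF"
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Guard: only add the column when there is at least one row to attach it to
if (nrow(coordinates) > 0) {
  coordinates$DEST <- "ABCDEF"
}
```

With the guard in place, the assignment is skipped for empty results, so the loop can move on to the next string.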

1 Answer


Assuming the GET request returns NULL, you can check for its length.

Try using something like this:

# Pre-allocate one slot per destination string
data_lst <- vector('list', nrow(DESTINATIONS))

for (i in seq_len(nrow(DESTINATIONS))) {
   location_url <- paste0(base_url, "destinations?name=",
                          DESTINATIONS$DESTINATIONS[i], specs)
   destination_res <- GET(location_url)
   # Only process the response when something came back
   if (length(destination_res) > 0) {
      destination_text <- content(destination_res, "text", encoding = "UTF-8")
      location_df1 <- fromJSON(destination_text, flatten = TRUE)
      location_df1 <- do.call(c, unlist(location_df1, recursive = FALSE))
      location_df1 <- as.data.frame(t(location_df1))
      coordinates_a <- select(location_df1, contains("items.country.name"))
      coordinates_a <- coordinates_a %>% distinct() %>% t()
      coordinates_a <- as.data.frame(coordinates_a)
      coordinates_b <- select(location_df1, contains("items.id"))
      coordinates_b <- coordinates_b %>% distinct() %>% t()
      coordinates_b <- as.data.frame(coordinates_b)
      coordinates <- cbind.data.frame(coordinates_a, coordinates_b)
   } else {
      # Mark the failed request so it can be found later
      coordinates <- NA
   }
   data_lst[[i]] <- coordinates
}
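
One caveat, offered as a sketch rather than a confirmed fix: `GET()` from httr returns a response object, which is a list of fields, so its length is greater than zero for any completed request. If the API signals "no match" with status 200 and a `null` body, as described in the comments on the question, the test probably has to be made on the parsed content instead. The helper name `is_empty_result` and the JSON shape below are assumptions, since the real API is not available.

```r
library(jsonlite)  # fromJSON()

# Hypothetical helper: TRUE when the parsed body carries no usable data
is_empty_result <- function(parsed) {
  is.null(parsed) || length(parsed) == 0
}

# Stand-ins for content(destination_res, "text", encoding = "UTF-8");
# the JSON shape is assumed, not taken from the real API
body_hit  <- '{"items":[{"id":"42","country":{"name":"France"}}]}'
body_miss <- 'null'

hit_empty  <- is_empty_result(fromJSON(body_hit, flatten = TRUE))
miss_empty <- is_empty_result(fromJSON(body_miss, flatten = TRUE))
print(c(hit = hit_empty, miss = miss_empty))
```

In the loop, this would mean calling `fromJSON()` first and branching on `is_empty_result(location_df1)` rather than on the length of the response. Checking `status_code(destination_res)` before parsing is a reasonable extra guard.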

The requests that failed can then be found with:

which(is.na(data_lst))

To bind the data together, drop those NA values:

complete_data <- do.call(rbind, Filter(is.data.frame, data_lst))
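
If the failed strings should appear in the output file as NA rather than being dropped, one option is to swap each failed entry for a one-row placeholder before binding. This is a minimal sketch under assumptions: `dest` stands in for `DESTINATIONS$DESTINATIONS`, and the column names `country` and `id` are hypothetical.

```r
# Hypothetical inputs standing in for the destination strings and data_lst
dest <- c("PARIS", "ABCDEF", "ROME")
data_lst <- list(
  data.frame(country = "France", id = "42"),
  NA,
  data.frame(country = "Italy", id = "7")
)

# One-row placeholder used for failed or empty results
na_row <- data.frame(country = NA_character_, id = NA_character_)

rows <- lapply(seq_along(data_lst), function(i) {
  x <- data_lst[[i]]
  # Keep real results; replace NA markers and zero-row frames by the placeholder
  res <- if (is.data.frame(x) && nrow(x) > 0) x else na_row
  res$DEST <- dest[i]  # safe: res always has at least one row here
  res
})

complete_data <- do.call(rbind, rows)

# Strings that produced no result, i.e. candidates to fix in the source data
failed <- unique(complete_data$DEST[is.na(complete_data$id)])
print(complete_data)
print(failed)
```

Because every entry has at least one row, adding the original string as a `DEST` column no longer hits the zero-row case.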
Ronak Shah
  • The loop works well, but I get the same error when I try to add a column with the original string. I am adding ```coordinates$DEST <- DESTINATIONS$DESTINATIONS[i]```. I tried 100 strings and got results for 90, which seems to be correct. Also, ```which(is.na(data_list))``` gives me ```integer(0)``` – marine8115 Jul 22 '20 at 11:50
  • Hmmm....sorry, I don't understand why/how you could be getting this error. This is what I could come up with in the absence of a reproducible example. – Ronak Shah Jul 22 '20 at 12:10
  • To keep the loop from stopping, I added an IF statement specific to the output of my API that checks the pattern. As this is not a generic answer, I do not know if I should be posting it. – marine8115 Jul 23 '20 at 05:02
  • If it solved your problem you should definitely post it. – Ronak Shah Jul 23 '20 at 05:16