2

Can someone help me with syntax for a function? The script works fine when not embedded in another function:

library(tidyverse)
library(rvest)
library(xml2)
library(haven)
library(labelled)

redcap1 <- structure(list(record_id = structure(c("1", "2", "3"), label = "Record ID", class = c("labelled", 
"character")), dsq_xx1_age = structure(c(45, 34, 57), label = "<div class=\"rich-text-field-label\"><p><span style=\"font-weight: normal;\">How old are you (in years)?</span></p></div>", class = c("labelled", 
"numeric")), dsq_complete = structure(c(1L, 1L, 1L), .Label = c("Incomplete", 
"Unverified", "Complete"), class = c("redcapFactor", "factor"
), redcapLabels = c("Incomplete", "Unverified", "Complete"), redcapLevels = 0:2)), row.names = c(NA, 
-3L), class = "data.frame")
if (redcap1$dsq_complete %>%
  attributes() %>%
  unlist() %>% 
  pluck("label") %>% 
  is.null() ==TRUE) { 
     redcap1$dsq_complete %>% 
     substitute() %>% 
     deparse() %>%
     str_extract("(?<=\\$).*")}

but when I try to embed in another function like this I get NA:

test <- function(x){
  if ((x) %>%
  attributes() %>% 
  unlist() %>% 
  pluck("label") %>% 
  is.null() ==TRUE) { 
     (x) %>% 
     substitute() %>% 
     deparse() %>%
     str_extract("(?<=\\$).*")
  }}

What I am trying to do is extract a column name if no attribute is assigned from an exported Redcap API call. This part is embedded within a larger function that is doing more, but this is the code that is throwing an error.

I don't know if I need to use the .x syntax for the function?? Thanks for any help.

this is the larger function:

strip_redcap_html <- function(x) {
    if ((x) %>%
  attributes() %>% 
  unlist() %>% 
  pluck("label") %>% 
  is.null()) { 
     (x) %>% 
     substitute(.) %>% 
     deparse() %>%
     str_extract("(?<=\\$).*")
      }
  
  if ((x) %>%
  attributes() %>% 
  unlist() %>% 
  pluck("label") %>% 
  str_detect("<div")) { 
    attributes(x) %>% 
  unlist() %>% 
  pluck("label") %>% 
  simplify() %>% 
  read_xml() %>% 
  html_text()
    
  }  else { (attributes(x) %>% 
       unlist() %>% 
      pluck("label"))}
}

and I plan on calling like so:

output <- redcap1 %>%
  summarise(across(everything(), ~strip_redcap_html(.)))
wdefreit
  • 51
  • 3
  • 1
    Please share a little bit of sample input and expected output. – Gregor Thomas Mar 15 '22 at 13:43
  • 1
    As a general note, `== TRUE` is almost never needed. `if(foo == TRUE)` is a long way of writing `if(foo)`. And mixing pipes with other binary operators like `==` can lead to issues with order of operations. – Gregor Thomas Mar 15 '22 at 13:44
  • Will remove the ==TRUE (thx); So for the function the input will be the redcap column name and the output here will just be the colname, but the larger function is stripping the html out and replacing with just text (still working on the function. The problem is I am running into a problem when I try to run the function on columns without the attribute label.. – wdefreit Mar 15 '22 at 13:50
  • 1
    Great, please use `dput()` or another method to share a copy/pasteable R object that we can test and debug the code on. A small data frame with 2 or 3 columns and 2 or 3 rows would be perfect. E.g. `dput(redcap[1:3, 1:2])` for the first 3 rows and 2 columns. Choose a suitable small subset to illustrate the problem. – Gregor Thomas Mar 15 '22 at 14:00
  • Thanks just added the dput() – wdefreit Mar 15 '22 at 14:12

1 Answers1

1

Thank you @GabrielOdom and @RayBalise. Also Thank you @GregorThomas for your suggestions to clean up the code. This is the solution I was able to figure out with the help of those much wiser than I:

library(tidyverse)
library(rvest)
library(xml2)
library(haven)
library(labelled)

strip_redcap_html <- function(x) {
  call_char <- as.character(match.call())
  if ((x) %>%
      attributes() %>% 
      unlist() %>% 
      pluck("label") %>% 
      is.null()) { 
    return(
      if (str_detect(call_char[2],"(?:\\$)")) {
        str_extract(call_char[2], "(?<=\\$).*")
      } else {call_char[2]} )
    
  }
  
  if ((x) %>%
      attributes() %>% 
      unlist() %>% 
      pluck("label") %>% 
      str_detect("<div")) { return(
        attributes(x) %>% 
          unlist() %>% 
          pluck("label") %>% 
          simplify() %>% 
          read_xml() %>% 
          html_text())
    
  }  else { return(attributes(x) %>% 
                     unlist() %>% 
                     pluck("label"))}
}

x <-strip_redcap_html(redcap$dsq_xx1_age)
y <-strip_redcap_html(redcap$record_id)
z <-strip_redcap_html(redcap$dsq_complete)

output <- redcap %>% 
  summarise(across(everything(), ~strip_redcap_html(.)))
wdefreit
  • 51
  • 3