This might be a hard question to field because I can't easily offer a reproducible SNAFU without providing my access token to the Gmail API, but my hope is that I am tripping over an issue that will be clear enough from my description. We'll see...
I wrote a function that takes the Gmail message ID of a Google Scholar Alert and parses it into a data frame with columns for article title, authors, publication, date, etc. My conundrum is that the same code that works when I load it interactively (i.e., load the function into session RAM a la "my_fancy_function <- function(arguments){blah blah}") does not work when I load the function as part of an R package (a la 'devtools::load_all("/mypackage/")'). It's worth noting that the other main function in my package works fine either way; only this one trips up when I call it from the package version.
Below is the .R file content of the function that won't come to heel when loaded as part of a package -- and I hereby acknowledge in advance that it's some ugly mess of code, if that will spare me some finger wagging in your answers. This feels like the package version of the classic "strings as factors" SNAFU, but you tell me. The problem seems to occur very early in the function evaluation, and the error I get reads as follows:
Error in UseMethod("read_xml") : no applicable method for 'read_xml' applied to an object of class "NULL"
And here is my whole sick code for this function:
library(stringr)
library(rvest)
library(plyr)
library(dplyr)
library(lubridate)
library(gmailr)
GScholar_alert_msg_to_df <- function(message_id){
  # Fetch the message and pull out its HTML body
  one_message <- message(message_id)
  msg_html <- body(one_message)
  # Titles and links come from the <h3> headings
  title <- read_html(msg_html) %>% html_nodes("h3") %>% html_text()
  link <- read_html(msg_html) %>% html_nodes("h3 a") %>% html_attr("href")
  # Split the raw HTML on link tags; drop the first chunk and the last two
  msg_chunks <- msg_html %>% str_split("<a href") %>% unlist
  msg_chunks <- msg_chunks[2:(length(msg_chunks)-2)]
  excerpt <- msg_chunks %>% str_replace_all(fixed("<b>"), "") %>%
    str_replace_all(fixed("</b>"), "") %>%
    str_replace_all(fixed("<br>"), "") %>%
    str_extract("<(font size=2 color=#222222|font size=\"-1\")>(.*?)</font>") %>%
    unlist %>% str_replace_all("<(font size=2 color=#222222|font size=\"-1\")>", "") %>%
    str_replace_all("</font>", "")
  author_pub_field <- msg_chunks %>% str_replace_all(fixed("<b>"), "") %>%
    str_replace_all(fixed("</b>"), "") %>%
    str_extract("<font size=(2|\"-1\") color=#(006621|009933|008000)>(.*?)</font>") %>%
    unlist %>%
    str_replace_all("<font size=(2|\"-1\") color=#(006621|009933|008000)>", "") %>%
    str_replace_all("</font>", "")
  one_message_df <- data.frame(title, excerpt, link, author_pub_field, stringsAsFactors = FALSE)
  one_message_df$date <- date(one_message) # needs reformatting
  one_message_df$date %>% str_replace("[[:alpha:]]{3}, ", "") %>%
    str_extract("^.{11}") %>% dmy -> one_message_df$date
  # Split the "author - publication" field into separate columns
  one_message_df$author_only <- str_detect(one_message_df$author_pub_field, " - ")
  one_message_df$author <- one_message_df$author_pub_field %>%
    str_extract("^(.*?) - ") %>% str_replace(" - ", "")
  one_message_df$author <- ifelse(one_message_df$author_only == 1, one_message_df$author, one_message_df$author_pub_field)
  one_message_df$publication <- one_message_df$author_pub_field %>%
    str_extract(" - (.*?)$") %>% str_replace(" - ", "") %>%
    str_replace(", [0-9]{4}$", "") %>% str_replace("^[0-9]{4}$", NA)
  # Flag author lists and publication names truncated with an ellipsis
  one_message_df$author_MIAs <- str_detect(one_message_df$author, "…")
  one_message_df$author %>% str_replace("…", " \\.\\.\\.") -> one_message_df$author
  one_message_df$pub_name_clipped <- one_message_df$publication %>% str_detect("…")
  one_message_df$publication %>% str_replace("…", " \\.\\.\\.") -> one_message_df$publication
  return(one_message_df)
}
Yeah, it's ugly, but again, I promise it works when the code is entered interactively. As for the misbehaving package version, according to the RStudio traceback the error seems to happen way up top, I think in either the body() or the message() call I'm making to the gmailr package. (Yes, I've tried running the code from the Terminal, to no happier conclusion.) Help me, oh be anyone, you're my only hope.
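One thing that might be worth checking, since the error complains about an object of class "NULL": base R also has functions named message() and body(). The sketch below uses only base R (no gmailr, no token) and assumes, without having confirmed it, that inside the package those two names resolve to the base versions rather than the gmailr ones. The message ID string is a made-up placeholder.

```r
# Sketch only: base R, no gmailr or network access needed.
# Assumption (unconfirmed): inside the package, message() and body()
# resolve to base R instead of gmailr.

# base::message() prints its argument to stderr and returns NULL invisibly
one_message <- base::message("placeholder-message-id")

# base::body() of something that is not a function is NULL
# (R may also warn that the argument is not a function)
msg_html <- base::body(one_message)

is.null(one_message)
is.null(msg_html)

# Show which namespace each name resolves to in the current context
environmentName(environment(message))
environmentName(environment(body))
```

If the last two lines report base when run from inside the package code, then calling the functions with an explicit prefix (gmailr::message(), gmailr::body()) would be one way to test the theory.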