
I'm working in R 3.6 within a third-party environment (IBM Watson Studio) and need to convert a model object into a raw binary or string buffer in order to save the model as indicated here. Briefly, I have to use a function that makes an API request that sends the raw binary/string buffer data in order to save files. Similar to this question, I opted to convert the model to JSON using jsonlite::serializeJSON() (and then to raw). It seemed to work but, like the person in that post, I ran into the issue that serializeJSON() cannot convert the complex lists in the model object appropriately. When I attempt to predict, I hit the error `could not find function "list"`. Since I have a GLM, even after implementing the suggested fix I now get the error `object 'C_logit_linkinv' not found`. Clearly, this method will not generalize well across different models.

I'm posting this as a new question because none of the solutions work in my case, and because I'd like to see if there's a more robust way of converting model objects to raw bytes or string buffers. I'm not posting to the IBM boards because I felt this was a more general issue about model objects and conversion, but I will do so if the community feels that's more appropriate. Here's a reproducible example:

library(jsonlite)

# Generate some data
data <- data.frame(x1=rnorm(100),
                   y=rbinom(100,1,.6))

# Fit and convert model->JSON->Raw
fitted_model <- glm(y ~ 0 + ., 
                   data = data, 
                   family = binomial)
model_as_json <- jsonlite::serializeJSON(fitted_model)
model_as_raw <- charToRaw(model_as_json)

# Convert back to model
back_to_json <- rawToChar(model_as_raw)
back_to_model <- jsonlite::unserializeJSON(back_to_json)

# Score
scoring_data <- data.frame(x1=rnorm(5))
predict(object=back_to_model,
        newdata = scoring_data,
        type='response')

Specs:

  • R version 3.6.1 (2019-07-05)
  • Platform: x86_64-conda_cos6-linux-gnu (64-bit)
  • Running under: Red Hat Enterprise Linux 8.2 (Ootpa)
  • jsonlite_1.7.1
ricniet
  • You should use the base function `saveRDS` for a binary save. For a text save, `dput` with `control = "exact"` will come close. To read them use `readRDS` or `dget` respectively. – user2554330 Oct 14 '21 at 00:02
  • But doesn't `saveRDS` require saving to an actual file? I'd need a binary stream to serialize in-memory. I hope I'm making sense... I'll give `dput` a look. Thanks @user2554330! – ricniet Oct 14 '21 at 01:18
  • @RitchieSacramento it works! Thanks so much. Do you want to answer the question instead so I can accept it? If I don't hear back, I'll answer it myself and give you credit. :) – ricniet Oct 15 '21 at 02:51

1 Answer


You can use serialize() with the connection argument set to NULL, which returns a raw vector, and then unserialize() to restore the object from that vector.

Importantly, as per the documentation:

Sharing of reference objects is preserved within the object

set.seed(9)
data <- data.frame(x1=rnorm(100),
                   y=rbinom(100,1,.6))

# Fit and convert model -> Raw
fitted_model <- glm(y ~ 0 + ., 
                    data = data, 
                    family = binomial)

model_as_raw <- serialize(fitted_model, connection = NULL)

# Convert back to model
back_to_model <- unserialize(model_as_raw)

# Check that predict works
scoring_data <- data.frame(x1=rnorm(5))

predict(object=back_to_model,
        newdata = scoring_data,
        type='response')

        1         2         3         4         5 
0.4908404 0.4871576 0.4955416 0.4978725 0.5065067 
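
If the API needs a string buffer rather than raw bytes, one possible approach is to base64-encode the raw vector returned by serialize(). The sketch below assumes jsonlite (already loaded in the question) is available for base64_enc() and base64_dec(); whether the Watson Studio function accepts a base64 string is an assumption that would need checking against its documentation.

```r
library(jsonlite)  # used here only for base64_enc() / base64_dec()

set.seed(9)
data <- data.frame(x1 = rnorm(100),
                   y  = rbinom(100, 1, 0.6))
fitted_model <- glm(y ~ 0 + ., data = data, family = binomial)

# Model -> raw vector -> base64 string (plain text, usable as a string buffer)
model_as_raw    <- serialize(fitted_model, connection = NULL)
model_as_string <- jsonlite::base64_enc(model_as_raw)

# base64 string -> raw vector -> model
restored_raw  <- jsonlite::base64_dec(model_as_string)
back_to_model <- unserialize(restored_raw)

# The restored model should score new data the same way as the original
scoring_data <- data.frame(x1 = rnorm(5))
all.equal(predict(fitted_model,  newdata = scoring_data, type = "response"),
          predict(back_to_model, newdata = scoring_data, type = "response"))
```

Unlike serializeJSON(), this keeps R's native serialization format end to end, so the string is only a transport encoding of the same bytes and should not depend on the model class.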
Ritchie Sacramento