I want to distribute a fitted gamlss model (lms) as a file. The data used to fit the gamlss model must not be published, as it is subject to privacy regulations.

However, when I use save or saveRDS, the resulting files are huge and contain the raw data. Removing individual components from the gamlss model object, e.g. $y, is possible but does not seem like the correct way to go (and, depending on which component I drop, it sometimes results in errors when I use gamlss functions on the modified object).

Is there a better way to distribute/serialize gamlss models that a) does not reveal the underlying data, b) allows use of the gamlss functions, and c) results in smaller files?

(My searches regarding this were unsuccessful, perhaps because of inadequate search terms - if so, a pointer in the right direction would be much appreciated.)

Thanks, Jakob

Jakob
  • I think you have three options. One is to attempt to anonymise the data before running the analysis. This may not be possible depending on whether the directly identifiable characteristics are variables in the regression, but would be the best option all-round if you can. The second is to share only a pertinent summary of the gamlss - this is the standard way of publishing the results of a model. The third is to work out what you want people to be able to do with the model and write S3 methods for a stripped-down gamlss object. – Allan Cameron Sep 16 '21 at 12:07
  • Please provide enough code so others can better understand or reproduce the problem. – Community Sep 16 '21 at 19:59
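
To make the second and third options in the first comment more concrete, here is a minimal sketch in base R of a stripped-down object that carries only a table of fitted parameters and no observations. It rests on several assumptions that are not in the question: that the fitted family is Box-Cox Cole and Green (BCCG, the classic LMS method), that mu, sigma and nu have already been predicted on a grid of x while the data were still available (for instance with something like gamlss's predictAll), and that linear interpolation between grid points is accurate enough. The class name lms_ref, the helper names and all numbers in the table are hypothetical. Note that this meets requirements (a) and (c) but not (b): the gamlss functions themselves will not work on such an object.

```r
# Hypothetical LMS table. In practice the columns would be filled from the
# fitted model before the raw data are discarded; these numbers are made up.
lms_table <- data.frame(
  x     = c(0, 5, 10, 15, 20),
  mu    = c(10, 18, 30, 50, 65),        # M: median curve
  sigma = c(0.10, 0.11, 0.12, 0.13, 0.13), # S: coefficient of variation
  nu    = c(1.0, 0.5, 0.0, -0.5, -1.0)  # L: Box-Cox power
)

# The shareable object holds the parameter table only, no observations.
lms_ref <- structure(list(table = lms_table), class = "lms_ref")

# Linearly interpolate mu, sigma and nu at arbitrary x.
# Values of x outside the grid give NA (the default behaviour of approx()).
lms_params <- function(ref, x) {
  tab <- ref$table
  data.frame(
    mu    = approx(tab$x, tab$mu,    xout = x)$y,
    sigma = approx(tab$x, tab$sigma, xout = x)$y,
    nu    = approx(tab$x, tab$nu,    xout = x)$y
  )
}

# z-score of an observation y at covariate value x (Cole's LMS formula).
z_score <- function(ref, x, y) {
  p <- lms_params(ref, x)
  ifelse(abs(p$nu) < 1e-8,
         log(y / p$mu) / p$sigma,
         ((y / p$mu)^p$nu - 1) / (p$nu * p$sigma))
}

# Centile curve value at x for a probability such as 0.5 or 0.97.
centile <- function(ref, x, prob) {
  p <- lms_params(ref, x)
  z <- qnorm(prob)
  ifelse(abs(p$nu) < 1e-8,
         p$mu * exp(p$sigma * z),
         p$mu * (1 + p$nu * p$sigma * z)^(1 / p$nu))
}

# Serialise the small object; the file contains the table and nothing else.
path <- file.path(tempdir(), "lms_ref.rds")
saveRDS(lms_ref, path)
restored <- readRDS(path)
```

A recipient would then call, for example, z_score(restored, x = 7, y = 25) or centile(restored, x = c(5, 10), prob = 0.97). If the model uses a different family (lms can select other Box-Cox variants), the two formulas would have to be replaced accordingly.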

0 Answers