I have a sensitive data set that should never be stored unencrypted on disk. Can R deal with this or is full disk encryption my only option?
- How is it encrypted? Check out the [`PKI` package in R](http://cran.r-project.org/web/packages/PKI/PKI.pdf). – jlhoward Aug 15 '14 at 00:31
- @jlhoward It can be encrypted in whichever way would work best with R. Thank you for the package reference. – orizon Aug 15 '14 at 03:07
- The `system` function, or on Windows the `shell` function, can be used to pass commands to utilities that have an API. – IRTFM Aug 15 '14 at 03:56
- I came across a package called `encryptr`, which helps with encrypting, decrypting, and loading data. Documentation is available at https://encrypt-r.org/ – pradeepvaranasi Feb 05 '20 at 09:54
1 Answer
I have a feeling there's an easier way to do this, but the `digest` package, which does AES encryption, is the closest thing I came across to what you are asking for. This should get you started.

    library(digest)

    # write encrypted data frame to file
    write.aes <- function(df, filename, key) {
      # render the data frame as CSV lines; local = TRUE keeps `out` inside this function
      zz <- textConnection("out", "w", local = TRUE)
      write.csv(df, zz, row.names = FALSE)
      close(zz)
      txt <- paste(out, collapse = "\n")
      raw <- charToRaw(txt)
      # zero-pad to a multiple of the 16-byte AES block size
      raw <- c(raw, as.raw(rep(0, 16 - length(raw) %% 16)))
      aes <- AES(key, mode = "ECB")
      writeBin(aes$encrypt(raw), filename)
    }

    # read encrypted data frame from file
    read.aes <- function(filename, key) {
      # read the whole file; a fixed n (e.g. 1000) would truncate larger files
      dat <- readBin(filename, "raw", n = file.info(filename)$size)
      aes <- AES(key, mode = "ECB")
      raw <- aes$decrypt(dat, raw = TRUE)
      # drop the zero padding before converting back to text
      txt <- rawToChar(raw[raw > 0])
      read.csv(text = txt)
    }

    # sample data
    set.seed(1)  # for reproducible example
    data <- data.frame(x = rnorm(10), y = rpois(10, 1),
                       z = letters[1:10], w = sample(1:0, 10, replace = TRUE))
    # demo key only: 32 bytes from a seeded generator, not cryptographically random
    set.seed(123581321)
    key <- as.raw(sample(1:32, 32))
    write.aes(data, "encrypted.dat", key)
    result <- read.aes("encrypted.dat", key)
    # did it work?
    all.equal(data, result)
    # [1] TRUE

This uses ECB mode AES encryption. Obviously you need to use the same key to encrypt and decrypt. `write.aes(...)` converts the data frame to a CSV-formatted text string, converts that to raw (which is required for AES), pads the raw vector out to a multiple of 16 bytes (also required for AES), encrypts, and writes to a binary file. `read.aes(...)` basically reverses the process.
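The padding step is the least obvious part of that pipeline, so here is a minimal sketch of it in isolation, using base R only. The helper names `pad_zero` and `unpad_zero` are hypothetical (they are not part of `digest`); they just mirror what the two functions above do inline.

```r
# Hypothetical helpers (not from digest): zero-pad to the AES block size, then strip again.
pad_zero <- function(raw_vec, block = 16L) {
  # number of padding bytes is always between 1 and `block`, as in write.aes
  n_pad <- block - length(raw_vec) %% block
  c(raw_vec, as.raw(rep(0, n_pad)))
}

unpad_zero <- function(raw_vec) {
  # removes every zero byte, so this is only safe for text without NUL bytes
  raw_vec[raw_vec != as.raw(0)]
}

msg    <- charToRaw("x,y\n1,2")   # 7 bytes of CSV text
padded <- pad_zero(msg)

length(padded)                    # a multiple of 16
rawToChar(unpad_zero(padded))     # the original text again
```

Because stripping removes all zero bytes, this scheme works for CSV text but would corrupt arbitrary binary payloads.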
This is just an example, intended to be modified to suit your needs. For instance, this saves the data frame without row names, which might or might not be a problem.
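If row names matter, one possible modification is to deparse the object with `dput` instead of going through CSV. The sketch below uses base R only and a made-up data frame; it shows just the text round trip, on the assumption that the resulting string would be padded and encrypted the same way as the CSV text.

```r
# Made-up data frame with row names that write.csv(row.names = FALSE) would drop
df <- data.frame(x = c(1.5, 2.5), z = c("a", "b"), row.names = c("r1", "r2"))

# Deparse the object to a single text string (this is what would be encrypted)
txt <- paste(capture.output(dput(df)), collapse = "\n")

# After decryption, rebuild the object by evaluating the text
restored <- eval(parse(text = txt))

rownames(restored)
all.equal(df, restored)
```

Note that `eval(parse(...))` runs whatever code the text contains, so this is only reasonable for files you encrypted yourself.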

jlhoward
- Thanks for this excellent code! You could use `dput` and `dget` rather than `write.csv` and `read.csv` to store arbitrary R objects in text format before converting them into binary format. – cryo111 Jun 21 '16 at 16:23
- Thanks for the great example! Do I understand correctly that I have to store the key somewhere other than in the code? If I saved it in the code, it would not really be secure anymore. – schluk5 Mar 02 '18 at 20:36
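
One common way to keep the key out of the script is to read it from an environment variable at run time. This is only a sketch under assumptions: the variable name `MY_AES_KEY` and the helper `hex_to_raw` are hypothetical, and the key is assumed to be stored as a hex string. Base R only.

```r
# Hypothetical helper: convert a hex string such as "0f1a..." into a raw vector
hex_to_raw <- function(hex) {
  stopifnot(nchar(hex) > 0, nchar(hex) %% 2 == 0)
  starts <- seq(1, nchar(hex), by = 2)
  as.raw(strtoi(substring(hex, starts, starts + 1), base = 16L))
}

# For demonstration only: normally the variable is set outside R
# (for example in the shell or in ~/.Renviron), never in the script itself.
Sys.setenv(MY_AES_KEY = paste(rep("0f", 32), collapse = ""))

key <- hex_to_raw(Sys.getenv("MY_AES_KEY"))
length(key)   # 32 bytes, suitable as a 256-bit AES key
```

The resulting `key` can be passed to `write.aes` and `read.aes` in place of the hard-coded one.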