
I have a password-protected .rar archive that contains a single CSV file, and I want to read that CSV in RStudio.

I tried to do it with the following code:

library(Hmisc)

getZip("datos/diarios.rar", password = "israel")

But R returned this:

A connection with                                                                                    
description "C:\\WINDOWS\\system32\\cmd.exe /c unzip -p -P israel datos/diarios.rar"
class       "pipe"                                                                  
mode        "r"                                                                     
text        "text"                                                                  
opened      "closed"                                                                
can read    "yes"                                                                   
can write   "yes" 

How can I resolve this problem?


When I run read.csv on it, it doesn't work:

read.csv(gzfile("datos/diarios.zip", open = ""), header = T) 

Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  more columns than column names
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
  line 2 appears to contain embedded nulls

user20650
  • From the help page, it seems you would, for example, use `read.csv` on this. – user20650 Jan 19 '21 at 01:32
  • It doesn't work. Look at: read.csv(gzfile("datos/diarios.zip", open = ""), header = T) Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names In addition: Warning messages: 1: In read.table(file = file, header = header, sep = sep, quote = quote, : line 1 appears to contain embedded nulls 2: In read.table(file = file, header = header, sep = sep, quote = quote, : line 2 appears to contain embedded nulls – Israel Balbuena Jan 19 '21 at 02:07
  • That now seems like more of an issue with the data and read.csv than with opening the compressed file. Unfortunately, that is hard to troubleshoot without the data. Are you able to open it in a text editor and have a look at it? – user20650 Jan 19 '21 at 10:03
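
One way to follow up on the last comment without a text editor is to look at the first few bytes of the file from R. "Embedded nulls" warnings can appear when read.csv is handed raw archive bytes rather than text, so it may help to confirm what kind of file it really is. This is only a sketch: archive_type() is a hypothetical helper, not part of any package, and it recognises just three well-known signatures.

```r
# Hypothetical helper (not from the original post): guess the container
# format from the leading "magic" bytes of a file.
archive_type <- function(path) {
  sig <- readBin(path, what = "raw", n = 6L)
  rar <- as.raw(c(0x52, 0x61, 0x72, 0x21, 0x1a, 0x07))  # "Rar!" 0x1A 0x07
  zip <- as.raw(c(0x50, 0x4b))                          # "PK"
  gz  <- as.raw(c(0x1f, 0x8b))                          # gzip header
  if (length(sig) >= 6L && identical(sig[1:6], rar)) return("rar")
  if (length(sig) >= 2L && identical(sig[1:2], zip)) return("zip")
  if (length(sig) >= 2L && identical(sig[1:2], gz))  return("gzip")
  "unknown"
}

# Demo on a throwaway file that carries only the RAR signature
demo <- tempfile(fileext = ".rar")
writeBin(as.raw(c(0x52, 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00)), demo)
print(archive_type(demo))
unlink(demo)
```

If this reports "rar" for the real file, tools that expect zip or gzip input (such as unzip or gzfile) are unlikely to decode it, whatever the file extension says.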

1 Answer


Suppose test.csv, containing a data frame, is stored in the archive test.rar. We can extract it with the command-line mode of 7-Zip, called through system(). This is essentially a .bat file executed from R; we just have to paste the command together.

z7 <- shQuote("C:/Program Files/7-Zip/7z.exe")  ## path to your 7z.exe
arch <- "V:/test.rar"  ## path to archive
temp <- tempdir()  ## the session's temporary directory, used as output folder
pw <- "1234"  ## provide password

Now, using paste(), the command could look like this:

(cmd <- paste(z7, "x", arch, "-aot", paste0("-o", temp), paste0("-p", pw)))
# [1] "\"C:/Program Files/7-Zip/7z.exe\" x V:/test.rar -aot -oC:\\Users\\jay\\AppData\\Local\\Temp\\Rtmp67ZAPK -p1234"

(x: extract with full paths; -aot: rename an existing file with a suffix instead of overwriting it; -o: output directory; -p: password)

which we'll execute with system()

system(cmd)
dat <- read.csv(paste0(temp, "/test.csv"))
dat
#   X X1 X2 X3 X4
# 1 1  1  4  7 10
# 2 2  2  5  8 11
# 3 3  3  6  9 12

unlink(file.path(temp, "test.csv"))  ## remove the extracted file (deleting tempdir() itself can break later tempfile() calls in the session)
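
The steps above could also be wrapped in a small function. The following is a sketch under assumptions rather than a tested recipe: read_csv_from_archive() and its argument names are made up for illustration, the default 7z.exe path is the one used above, and it relies on 7-Zip being installed. It uses system2(), which quotes the command itself and returns the exit status, and it extracts into a fresh subdirectory so cleanup does not touch the session's tempdir().

```r
# Hypothetical wrapper (names are illustrative, not from the answer above).
read_csv_from_archive <- function(archive, file, password,
                                  exe = "C:/Program Files/7-Zip/7z.exe",
                                  ...) {
  # Fail early if 7-Zip cannot be found at the given path or on the PATH
  if (!file.exists(exe) && !nzchar(Sys.which(exe))) {
    stop("7-Zip executable not found: ", exe)
  }

  # Extract into a fresh directory and remove it when the function exits
  out_dir <- tempfile("extract_")
  dir.create(out_dir)
  on.exit(unlink(out_dir, recursive = TRUE), add = TRUE)

  # Same switches as above: x = extract, -o = output dir, -p = password;
  # -y answers "yes" to any prompt so the call cannot block
  args <- c("x", shQuote(archive),
            paste0("-o", shQuote(out_dir)),
            paste0("-p", shQuote(password)),
            "-y")
  status <- system2(exe, args)
  if (status != 0) stop("7-Zip exited with status ", status)

  csv_path <- file.path(out_dir, file)
  if (!file.exists(csv_path)) stop("File not found in archive: ", file)
  read.csv(csv_path, ...)
}

# Usage, assuming the archive and password from the answer above:
# dat <- read_csv_from_archive("V:/test.rar", "test.csv", "1234")
```

Note that the password still appears on the command line, as in the original command, so this is only suitable where that is acceptable.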
jay.sf