
Reading .txt data from a single file in a .zip archive works fine.

testDir <- paste0(getwd(), "/xllnzoiu")
dir.create(testDir)
write.table(mtcars, file=paste0(testDir, "/test.txt"))
zip(paste0(testDir, "/testZip"), paste0(testDir, "/", "test.txt"), 
    flags="-j")
r <- read.table(unz(paste0(testDir, "/testZip.zip"), "test.txt"))
head(r, 3)
#                mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1

This also works with .csv files.

But when I do exactly the same with .dta (Stata data format), it fails.

foreign::write.dta(mtcars, file=paste0(testDir, "/test.dta"))
zip(paste0(testDir, "/testZip"), paste0(testDir, "/", "test.dta"), 
    flags="-j")
foreign::read.dta(unz(paste0(testDir, "/testZip.zip"), "test.dta"))
# Error in file.exists(file) : invalid 'file' argument
readstata13::read.dta13(unz(paste0(testDir, "/testZip.zip"), "test.dta"))
# Error in file.exists(file) : invalid 'file' argument
rio::import(unz(paste0(testDir, "/testZip.zip"), "test.dta"))
# Error in file.exists(file) : invalid 'file' argument

How can I read a single .dta file from a zip archive using unz()?

Edit

I've figured out that unzip() works.

r <- foreign::read.dta(unzip(paste0(testDir, "/testZip.zip"), "test.dta"))
head(r, 3)
#    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
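
A tidier variant of the same workaround, as a sketch only: by default unzip() extracts into the working directory, so passing exdir keeps the extracted copy in a scratch location. The directory and object names (work, out, dtaPath) are made up for illustration, and zip() still needs a zip utility on the PATH, as in the code above.

```r
library(foreign)  # recommended package that ships with R

# Build a small archive to work with (hypothetical scratch directory).
work <- tempfile("zipdemo_")
dir.create(work)
dtaFile <- file.path(work, "test.dta")
write.dta(mtcars, file = dtaFile)
zipFile <- file.path(work, "testZip.zip")
zip(zipFile, dtaFile, flags = "-j")  # -j: store the file without its directory path

# Extract only test.dta into a separate folder; unzip() returns the extracted path.
out <- file.path(work, "extracted")
dtaPath <- unzip(zipFile, files = "test.dta", exdir = out)

# read.dta() gets an ordinary file path, which is what it expects.
r <- read.dta(dtaPath)
head(r, 3)

# Remove the scratch directory, including the extracted copy.
unlink(work, recursive = TRUE)
```

The point of the design is that the reader function only ever sees a character path; the archive handling stays outside it.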

So the new question is, why does unz() fail?

jay.sf
  • I don't see anything in the documentation that indicates that `foreign::read.dta()` accepts a connection. – Ritchie Sacramento Nov 01 '19 at 11:23
  • To expand: `unz()` is not failing; rather, `foreign::read.dta()` cannot read from a connection, which is what `unz()` creates. `unzip()` works because it extracts the file from the archive to disk and returns its path, which can then be read. – Ritchie Sacramento Nov 01 '19 at 12:08
  • @H1 I see, probably the package inventors didn't care about connections. – jay.sf Nov 01 '19 at 12:57
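
The distinction drawn in the comments can be checked directly. This is a minimal sketch, assuming only base R; the archive name is a placeholder, because unz() merely describes the archive member and does not open it until something reads from it.

```r
# unz() creates a connection object; nothing is opened or extracted yet,
# so the placeholder archive does not need to exist for this check.
con <- unz("testZip.zip", "test.dta")

inherits(con, "connection")  # a connection object ...
is.character(con)            # ... not a character file path

# The error in the question comes from file.exists(), which accepts
# only character paths. Capture its message instead of stopping.
msg <- tryCatch(file.exists(con), error = function(e) conditionMessage(e))
print(msg)

close(con)  # release the unused connection
```

read.table() accepts connections, which is why the .txt and .csv cases work, whereas the .dta readers above check the path with file.exists() first.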
