I want to convert a .sas7bdat file to a .csv/txt format so that I can upload it into a hive table. I'm receiving the .sas7bdat file from an outside server and do not have SAS on my machine.
-
What have you done so far? – matsjoyce Oct 23 '14 at 16:23
-
It's very difficult to retrieve the data from a sas7bdat file without having SAS installed on your machine. Can you get the data in a different format, or transfer it to a computer or server that does have SAS installed? – mjsqu Oct 23 '14 at 16:30
-
This isn't possible without a tool of some sort. SAS7BDAT is a closed format, and only a few people have reverse engineered it. – Joe Oct 23 '14 at 16:58
5 Answers
Use one of the R foreign packages to read the file and then convert to CSV with that tool.
http://cran.r-project.org/doc/manuals/R-data.pdf Pg 12
Alternatively, use the sas7bdat package. It appears to ignore custom formats and read the underlying data values.
In SAS:
proc format;
    value agegrp
        low - 12  = 'Pre Teen'
        13 - 15   = 'Teen'
        16 - high = 'Driver';
run;

libname test 'Z:\Consulting\SAS Programs';

data test.class;
    set sashelp.class;
    age2 = age;
    format age2 agegrp.;
run;
In R:
install.packages("sas7bdat")
library(sas7bdat)
x<-read.sas7bdat("class.sas7bdat", debug=TRUE)
x

-
What happens to custom-formatted variables in the imported SAS dataset when using this approach? Does R just see the underlying values? – user667489 Oct 23 '14 at 20:59
-
https://github.com/hadley/haven is now a much faster alternative to the sas7bdat package – Saurfang Sep 20 '15 at 17:10
The Python package sas7bdat, available here, includes a library for reading sas7bdat files:
from sas7bdat import SAS7BDAT

# Iterate over the dataset and print each row as a list of values
with SAS7BDAT('foo.sas7bdat') as f:
    for row in f:
        print(row)
and a command-line program that requires no programming:
$ sas7bdat_to_csv in.sas7bdat out.csv
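If the goal is a text file for a Hive table rather than a plain CSV, the row loop above can feed a small writer instead of print. The following is only a sketch: `write_rows` is a hypothetical helper, not part of the sas7bdat package, and it assumes the reader yields each row as a list, that a tab delimiter suits the target table, and that missing values should be written as \N (Hive's default null marker for text tables). The sample rows are made-up stand-ins so the snippet runs without a SAS file.

```python
import csv
import io


def write_rows(rows, out, delimiter="\t", null="\\N"):
    """Write an iterable of rows to `out` as delimited text.

    None values are replaced by `null`. Returns the number of rows written.
    (Hypothetical helper for illustration, not part of sas7bdat.)
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    count = 0
    for row in rows:
        writer.writerow([null if value is None else value for value in row])
        count += 1
    return count


# Stand-in rows; a real run would pass the SAS7BDAT reader instead.
sample = [["Name", "Age"], ["Alfred", 14.0], ["Alice", None]]
buf = io.StringIO()
n = write_rows(sample, buf)
print(buf.getvalue(), end="")

# With the package installed, the same helper could be used like this:
# with SAS7BDAT('foo.sas7bdat') as f, open('foo.tsv', 'w', newline='') as out:
#     write_rows(f, out)
```

Whether a header row appears in the output depends on how the reader is configured, so check the first line of the file before loading it into Hive.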

-
The dtype information is lost with this (the metadata/header like type, length, label, etc.). All numbers show up as floats. – Raj Jun 22 '18 at 18:26
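One possible workaround for the float issue is to convert whole-number floats back to integers before writing them out. This is a heuristic sketch, not a recovery of the original SAS metadata: `restore_int` is a hypothetical helper, and it will also turn a genuinely fractional column's value such as 2.0 into 2. The sample row is made up for illustration.

```python
def restore_int(value):
    """Return whole-number floats as ints; leave everything else unchanged."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


# Stand-in row, shaped like what a SAS reader might yield
row = ["Alfred", 14.0, 69.0, 112.5, None]
cleaned = [restore_int(v) for v in row]
print(cleaned)
```

NaN and infinite values are left alone, because `float.is_integer()` returns False for them.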
I recently wrote this package, which lets you convert sas7bdat to CSV using Hadoop/Spark. It can split a giant sas7bdat file, thus achieving high parallelism. The parsing also uses parso, as suggested by @Ashpreet.

If this is a one-off, you can download the SAS system viewer for free from here (after registering for an account, which is also free):
http://support.sas.com/downloads/package.htm?pid=176
You can then open the sas dataset using the viewer and save it as a csv file. There is no CLI as far as I can tell, but if you really wanted to you could probably write an autohotkey script or similar to convert SAS datasets to csv.
It is also possible to use the SAS provider for OLE DB to read SAS datasets without actually having SAS installed, and that's available here:
http://support.sas.com/downloads/browse.htm?fil=0&cat=64
However, this is rather complicated - some documentation is available here if you want to get an idea:
http://support.sas.com/documentation/cdl/en/oledbpr/59558/PDF/default/oledbpr.pdf

-
Here is a description on how to view data using powershell, so I would think it is possible to use the same approach to export to CSV: http://blogs.sas.com/content/sasdummy/2012/04/12/build-your-own-sas-data-set-viewer-using-powershell/ – Stig Eide Oct 24 '14 at 11:23
-
Thanks, this helped me test the data, as I could not originally view the sas file – Ashpreet Bedi Nov 10 '14 at 17:45
Thanks for your help. I ended up using the parso utility in Java and it worked like a charm. The utility returns the rows as object arrays, which I wrote into a text file.
I referred to the utility from: http://lifescience.opensource.epam.com/parso.html
