9

I have some very large shapefiles. I can read them into SpatialPolygonsDataFrame objects using the rgdal function readOGR, but each file takes a very long time. I am actually only interested in the data.frame stored in the @data slot. Is there any way to read just the data, skipping the resource-intensive polygons?

Example code:

## State of Alabama census blocks (152 MB compressed, 266 MB uncompressed)
shpurl <- "http://www2.census.gov/geo/tiger/TIGER2011/TABBLOCK/tl_2011_01_tabblock.zip"
tmp    <- tempfile(fileext=".zip")
download.file(shpurl, destfile=tmp)
unzip(tmp, exdir=getwd())

## Read shapefile (readOGR comes from the rgdal package)
library(rgdal)
nm  <- strsplit(basename(shpurl), "\\.")[[1]][1]
lyr <- readOGR(dsn=getwd(), layer=nm)

## Data I want
head(lyr@data)
attitude_stool
  • Did you read thru the source code for `readOGR` ? It might well indicate either separate reads from the original file for different pieces of data, or that there is no such. – Carl Witthoft Nov 14 '12 at 17:47
  • Taking a peek at the Wikipedia page -- if you can determine which of the actual files (.shp,.atx,.sbn, etc.) contain the `@data` you want, it may be easier to roll your own function to read directly from that file. – Carl Witthoft Nov 14 '12 at 18:13
  • I would forget to do the obvious thing before asking. Yes, there is a separate call to create the `data.frame` for the `@data` slot. – attitude_stool Nov 14 '12 at 18:13
  • fwiw, I'm pretty sure you could do this with OGR itself, but readOGR requires that you get the geometry as well – mdsumner Nov 21 '12 at 00:36

2 Answers

8

Shapefiles are compound files that store their attribute data in a file with extension *.dbf. (See the Wikipedia shapefile article for a reference.) The dbf suffix refers to the dBase file format, which can be read by the function read.dbf() in the foreign package.

So, try this:

library(foreign)
df <- read.dbf("tl_2011_01_tabblock.dbf")
## And, more generally, read.dbf("path/to/shapefile/shapefile-name.dbf")
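
As a rough illustration of the same call, here is a self-contained sketch that does not need the census download. It writes a tiny stand-in attribute table with write.dbf() and reads it back. The column names GEOID and ALAND and their values are made up for the example and are not taken from the real file. The as.is=TRUE argument is optional; it should keep text fields as character rather than converting them to factors, which tends to matter for ID columns with leading zeros.

```r
library(foreign)

## Stand-in for a shapefile's attribute table (hypothetical columns and values)
dbf_path <- tempfile(fileext = ".dbf")
blocks <- data.frame(
  GEOID = c("010010201001000", "010010201001001"),
  ALAND = c(12345, 67890),
  stringsAsFactors = FALSE
)
write.dbf(blocks, dbf_path)

## Read only the attribute table; as.is=TRUE leaves character fields as character
df <- read.dbf(dbf_path, as.is = TRUE)
str(df)
```

With a real shapefile, dbf_path would simply point at the .dbf file that sits next to the .shp file.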
Josh O'Brien
0
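// Java alternative; DbaseFileReader comes from GeoTools (gt-shapefile must be on the classpath)
import java.io.FileInputStream;
import java.nio.charset.Charset;
import org.geotools.data.shapefile.dbf.DbaseFileReader;

FileInputStream fis = new FileInputStream("/folder/file.dbf");
DbaseFileReader dbfReader = new DbaseFileReader(fis.getChannel(), false, Charset.forName("ISO-8859-1"));

while (dbfReader.hasNext()) {
    // One Object[] per record, one element per DBF field
    final Object[] fields = dbfReader.readEntry();

    // The casts assume the first two fields come back as Long; adjust to the actual column types
    Long field1 = (Long) fields[0];
    Long field2 = (Long) fields[1];
    System.out.println("DBF field 1 value is: " + field1);
    System.out.println("DBF field 2 value is: " + field2);
}
dbfReader.close();
fis.close();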
mwendamseke