5

stackoverflow community,

I have two shapefiles of the municipalities of Japan. I'm using R to create separate plots for each of the municipalities. I can make this work with one shapefile, but the identical syntax fails with the other. Here are the code and the URLs for the data:

library(sp)
library(maptools)

# Map A - this one works
# Please download: http://www.filefactory.com/file/z26nirxoz53/n/JPN_adm_zip

# Enter your path for readShapePoly
japanMapA = readShapePoly("JPN_adm/JPN_adm2")
names(japanMapA) 

plot(japanMapA[japanMapA$ID_2 == 1199,])

# Map B - this one doesn't work
# Please download: http://geocommons.com/overlays/173340.zip

# Again, enter your path for readShapePoly    
japanMapB = readShapePoly("japan_ver71") 
names(japanMapB)

plot(japanMapB[japanMapB$JCODE == 45382,])

The error it throws is:

Error in plot(japanMapB[japanMapB$JCODE == 45382, ]) : 
error in evaluating the argument 'x' in selecting a method for function 'plot': Error in japanMapB[japanMapB$JCODE == 45382, ] : 
NAs not permitted in row index

I don't know how to go about removing the NAs in this case, so I am unable to plot the individual elements.

Would greatly appreciate your help: have been banging my head against the wall for a while on this one!

chrki
  • 6,143
  • 6
  • 35
  • 55
  • `plot(japanMapB)` works, so my guess is something is funny with your selection. What is `JCODE` and why are you selecting on this variable to plot? – thelatemail Aug 19 '13 at 04:56
  • Something like: `plot(japanMapB[which(japanMapB$JCODE == 45382),])` seems to work to plot a section. – thelatemail Aug 19 '13 at 05:08

2 Answers2

6

I think this comes down to an issue with appropriate sub-setting.

Your japanMapB object consists of some metadata and a series of polygons for each shape stored in japanMapB@polygons. So, you have:

> length(japanMapB$JCODE)
#[1] 1902
> length(japanMapB@polygons)
#[1] 1902

As @PaulHiemstra notes though, you have some NA values in your JCODE variable

> table(is.na(japanMapB$JCODE))

#FALSE  TRUE 
# 1894     8 

Which means you get NA results when trying to index the municipalities you want to plot.

> table(japanMapB$JCODE==45382,useNA="always")

#FALSE  TRUE  <NA> 
# 1893     1     8 

Wrapping in which solves this:

which(japanMapB$JCODE == 45382)
#[1] 1802
#You will now select to plot only the 1802th polygon stored in the data object
plot(japanMapB[which(japanMapB$JCODE == 45382),])
thelatemail
  • 91,185
  • 12
  • 128
  • 188
  • 1
    Using `which` works perfectly. Very grateful for your detailed response, @thelatemail, and for your points too, @PaulHiemstra: I have a much better understanding of how it's working now! – spatiallyConfused Aug 19 '13 at 05:48
3

What could be happening is that there are municipalities/polygons with an NA id. For example:

id = c(1,2,3,NA)
id == 2
# [1] FALSE  TRUE FALSE    NA

Notice that with this comparison, NA values in id do not lead to FALSE, but to NA. Using id == 2 for subsetting leads to the problem you see. @thelatemail wraps the call in which, which translates the logical vector to a set of indices where the logical vector is TRUE:

which(id == 2)
# [1] 3

Notice that the call to which eliminates the NA. This vector of indices can be safely used to subset. In general, you can use na.omit to remove NA's:

l[l == 2]
# [1]  2 NA
l[na.omit(l == 2)]
# [1] 2
Paul Hiemstra
  • 59,984
  • 12
  • 142
  • 149
  • 1
    Yep - that's certainly the issue - you beat me to the punch as I was typing, but I have explained the specific situation for the OP in another answer. – thelatemail Aug 19 '13 at 05:32
  • As @thelatemail points out, yes, that was definitely the issue. Many thanks to both for isolating and solving the problem! – spatiallyConfused Aug 19 '13 at 05:54