I'm using GeoTools to open gadm36.shp from gadm.org, which is a shapefile containing worldwide administrative areas. I'm trying to get a single Geometry (e.g. org.locationtech.jts.geom.MultiPolygon) for each country. So if, for example, there were 195 countries in this shapefile, I would have 195 Geometries. As a side note, I also have the GPKG file for the world, so if using that is simpler, I'm happy to use that instead.

// load collection from shapefile
File file = new File("<PATH>/gadm36.shp");
Map<String, Object> map = new HashMap<>();
map.put("url", file.toURI().toURL());
DataStore dataStore = DataStoreFinder.getDataStore(map);
String typeName = dataStore.getTypeNames()[0];
FeatureSource<SimpleFeatureType, SimpleFeature> source = dataStore.getFeatureSource(typeName);

FeatureCollection<SimpleFeatureType, SimpleFeature> collection = source.getFeatures(Filter.INCLUDE);

The collection returned by that last line has a size of 339127. It seems to contain every state, country, town, village, etc. How do I get a smaller list of just the countries?

J'e
  • Did you ever figure it out? I'm at the same spot. I just want the countries, but the metadata has "republic" and dozens of other variations. – Mastiff Jul 13 '22 at 19:15
  • @Mastiff Sort of. Merging them into a smaller list just made each polygon more complicated and didn't help with lookups taking a long time. My solution was to create a quad tree where each tile maps to all the countries that intersect its boundary. Each "Tile" has 4 child tiles [up-right, up-left, lower-right, lower-left], where each child is another instance of the Tile class. Using a lon/lat to find the country name went from about 5 seconds to about 5 ms. – J'e Jul 13 '22 at 19:56
  • Ok thanks. I was assuming the initial giant GADM contained thousands of small polygons representing states and counties and who-knows-what. Sounds like this is not the case, and that the country boundaries themselves are just complex? – Mastiff Jul 13 '22 at 20:51
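
For readers who want to try the quad tree idea from the comment above, here is a minimal sketch in plain Java. It is not J'e's actual code: the class name CountryQuadTree, the fixed subdivision depth, and the use of bounding boxes instead of real country geometries are all assumptions made to keep the example free of dependencies. Each leaf tile stores the names of the countries whose bounding box overlaps it, so a lookup only returns candidates.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: each tile holds the countries whose bounding box overlaps it. */
class CountryQuadTree {
    private final double minX, minY, maxX, maxY;   // tile extent in lon/lat degrees
    private final int depth;                       // remaining subdivision levels
    private final Set<String> countries = new LinkedHashSet<>();
    private CountryQuadTree[] children;            // null until the tile is subdivided

    CountryQuadTree(double minX, double minY, double maxX, double maxY, int depth) {
        this.minX = minX; this.minY = minY;
        this.maxX = maxX; this.maxY = maxY;
        this.depth = depth;
    }

    /** Registers a country by its bounding box (assumption: real code would test the geometry). */
    void insert(String name, double bMinX, double bMinY, double bMaxX, double bMaxY) {
        // Skip tiles that the bounding box does not touch at all.
        if (bMaxX < minX || bMinX > maxX || bMaxY < minY || bMinY > maxY) return;
        if (depth == 0) {                          // leaf tile: record the country here
            countries.add(name);
            return;
        }
        if (children == null) subdivide();
        for (CountryQuadTree child : children) {
            child.insert(name, bMinX, bMinY, bMaxX, bMaxY);
        }
    }

    /** Splits this tile into four children: up-right, up-left, lower-right, lower-left. */
    private void subdivide() {
        double midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
        children = new CountryQuadTree[] {
            new CountryQuadTree(midX, midY, maxX, maxY, depth - 1),
            new CountryQuadTree(minX, midY, midX, maxY, depth - 1),
            new CountryQuadTree(midX, minY, maxX, midY, depth - 1),
            new CountryQuadTree(minX, minY, midX, midY, depth - 1)
        };
    }

    /** Returns the candidate countries for a point; empty if the point is outside this tile. */
    Set<String> query(double lon, double lat) {
        if (lon < minX || lon > maxX || lat < minY || lat > maxY) return Set.of();
        if (children == null) return new LinkedHashSet<>(countries);
        Set<String> result = new LinkedHashSet<>();
        for (CountryQuadTree child : children) {
            result.addAll(child.query(lon, lat)); // children that miss the point return empty
        }
        return result;
    }
}
```

Because tiles are coarse and only bounding boxes are stored here, query can return a country that does not actually contain the point. A real lookup would presumably run an exact point-in-polygon test (for example with the JTS geometries) on the handful of candidates returned.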

1 Answer

Instead of Filter.INCLUDE you need to provide a filter that selects the features that you want. You'll need to play about a bit but if you just want countries then something like ENGTYPE_1 = 'Kingdom' or ENGTYPE_1 = 'Country' might do it.

The easiest way to build a filter is to add the gt-cql module to your project and use

Filter f = ECQL.toFilter("ENGTYPE_1 = 'Kingdom' or ENGTYPE_1 = 'Country'");
FeatureCollection<SimpleFeatureType, SimpleFeature> collection = source.getFeatures(f);

I'd also recommend using the GeoPackage rather than the shapefile, as your filter will be passed to the database and you'll get the features back more quickly.

And as a final style note, URLs.fileToUrl(file) is preferred over file.toURI().toURL() as it handles Windows file paths and spaces much better.

Ian Turton
  • GADM doesn't contain an ENGTYPE_1 of Country. Using Kingdom selects all 300+ UK records and a few for Wallis and Futuna. Thanks for the post anyway, because I did find it useful. – J'e Jan 11 '22 at 19:28
  • I guess there is an attribute that will allow you to filter out the boundaries that you need, but I couldn't find any metadata on GADM. – Ian Turton Jan 12 '22 at 09:07