Installed packages

  • datashader 0.13.0
  • holoviews 1.14.4
  • geoviews 1.9.1
  • bokeh 2.3.2

What I'm trying to do

I'm trying to create a choropleth map from a large GeoDataFrame using Datashader, with one color mapped to each category. I'm following an example on the Pipeline page and two Stack Overflow answers, all of which differ slightly in syntax and all of which use points rather than polygons.

Reproducible code sample

Below is a small sample of the full dataset.

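import pandas as pd
import geopandas as gpd
import datashader as ds
import datashader.transfer_functions as tf
# Assumed: the bare GeoDataFrame used below is the spatialpandas one
from spatialpandas import GeoDataFrame

d = {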
    'geometry': {
        0: 'POLYGON ((13.80961103741604 51.04076975651729, 13.80965521888065 51.04079016168103, 13.80963851766593 51.04080454197601, 13.80959433642561 51.04078412781548, 13.80961103741604 51.04076975651729))',
        1729: 'POLYGON ((13.80839606906416 51.03845025070634, 13.80827635138927 51.03836030644977, 13.80840483855695 51.03829244374037, 13.80852462026795 51.03838211873356, 13.80839606906416 51.03845025070634))',
        2646: 'POLYGON ((13.80894179055831 51.04544128170094, 13.80952887156242 51.0450399782091, 13.80954152432486 51.04504668985658, 13.80896834397535 51.04545611172818, 13.80894179055831 51.04544128170094))'
    },
    'category': {
        0: 'Within_500m',
        1729: 'Outside_500m',
        2646: 'River/stream'
    }
}

gdf = gpd.GeoDataFrame(pd.DataFrame(d), geometry=gpd.GeoSeries.from_wkt(pd.DataFrame(d)['geometry']), crs="EPSG:4326")

gdf['category'] = gdf['category'].astype('category')

spatialpdGDF = GeoDataFrame(gdf)

color_key = {'Within_500m': 'red', 'Outside_500m': 'lightgrey', 'River/stream': 'lightblue'}
canvas = ds.Canvas(plot_width=1000, plot_height=1000)
agg = canvas.polygons(spatialpdGDF, 'geometry', agg=ds.count_cat('category'))
tf.shade(agg, color_key=color_key)

Expected behaviour

I would expect all polys to be rasterized and displayed in a single color for each of the categories.

Observed behaviour

The full dataset results in an almost white image in which only some outlines are very faintly visible.


If I change the background color, some of the polys stand out more, though even the title is only faintly visible.

tf.Images(tf.set_background(tf.shade(agg, color_key=color_key, name="Custom color key"), "black"))


Does this have to do with Datashader calculating, as the Pipeline notebook mentions, "the transparency and color of each pixel according to each category’s contribution to that pixel"? But since each category is the sole contributor to each pixel (i.e. there is no spatial overlap with other categories in this case), why does the alpha seem to be set so low that one cannot see anything? I also tried the agg=ds.by('category') aggregator with the same result.

Incidentally, if I delete the 'category' column (which otherwise causes an "input must be numeric" error) and use GeoViews in combination with HoloViews rasterize, I can visualise the polys in a single color without problem. However, I haven't figured out how to use this approach to plot multiple datashaded GDFs with different color mappings on the same Bokeh or Matplotlib plot (the usual HoloViews "overlay multiplication" does not work in that case).

import geoviews as gv
from holoviews.operation.datashader import rasterize

gv.extension('bokeh')

del gdf['category']

rasterize(gv.Polygons(gdf)).opts(cmap=['red'])
Shiva127
grg
  • I can't immediately spot why the plot is ending up so light in this case. I agree that it shouldn't be due to color mixing, if there is only one category per pixel. For the HoloViews version, try `datashade` rather than `rasterize`; categorical shading is currently only available from `datashade`. Does https://examples.pyviz.org/nyc_buildings help as a starting point? – James A. Bednar Jul 13 '21 at 13:39
  • Oh, maybe try `agg=ds.by('category', ds.any())`, which will ignore overlap in case you have some. ds.by is the same as ds.count_cat but generalized to handle any type of aggregations (not just count), with count as the default. – James A. Bednar Jul 13 '21 at 13:46
  • `agg=ds.by('category', ds.any())` was the ticket! Thanks a million! If I try `datashade(agg, color_key=color_key)` I get "'DataArray' object has no attribute 'apply'". I'll check out the NYC Buildings example as it's very close to my use case - too bad this wasn't on the datashader page (or I was too blind to see it)! – grg Jul 13 '21 at 19:45
  • Glad to hear it! In that case, what's presumably happening is that you have some shapes that overlap by some tiny amount, resulting in 1 or 2 bright pixels (count of 2 shapes that overlap that pixel) and all the rest dim pixels (count of 1 shape that overlaps that pixel). ds.any() ignores the count, but at least now you know that there is some overlap (even if it's just a couple of pixels)! I never seem to be able to build an environment with geopandas anymore, or I'd try out your other issue. Yes, the nyc_buildings example is new and we'd love to see a PR on datashader's docs to add it! – James A. Bednar Jul 13 '21 at 20:02

1 Answer

Try agg=ds.by('category', ds.any()), which records only whether a category is present in a pixel and ignores how many polygons overlap there. ds.count_cat('category') is now an alias for ds.by('category', ds.count()), but as of Datashader 0.12.1 you are no longer limited to count and can, for example, use any to discard information about overlaps.
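
To see why a handful of overlapping pixels can wash out everything else, here is a rough pure-Python sketch. It is a simplified, hypothetical model and not Datashader's actual shading code: the function name `alpha_from_values`, the linear scaling, the alpha floor of 40, and the rule that a flat aggregate becomes fully opaque are all assumptions made for illustration.

```python
# Simplified, hypothetical model of count-based alpha scaling.
# This is NOT Datashader's implementation; it only illustrates the idea.

MIN_ALPHA = 40   # assumed alpha floor for covered pixels (0-255 scale)
MAX_ALPHA = 255


def alpha_from_values(values):
    """Map per-pixel aggregate values to alpha, scaled linearly between
    the smallest and largest nonzero value. Zero means 'no polygon here'."""
    covered = [v for v in values if v > 0]
    if not covered:
        return [0 for _ in values]
    lo, hi = min(covered), max(covered)
    alphas = []
    for v in values:
        if v <= 0:
            alphas.append(0)              # empty pixel stays transparent
        elif hi == lo:
            alphas.append(MAX_ALPHA)      # no spread: assume fully opaque
        else:
            frac = (v - lo) / (hi - lo)
            alphas.append(round(MIN_ALPHA + frac * (MAX_ALPHA - MIN_ALPHA)))
    return alphas


# Per-pixel polygon counts: one pixel is hit by two slightly overlapping shapes.
counts = [0, 1, 1, 1, 2, 1, 0]
# "any"-style aggregate: 1 wherever at least one polygon covers the pixel.
any_flags = [1 if c > 0 else 0 for c in counts]

count_alphas = alpha_from_values(counts)
any_alphas = alpha_from_values(any_flags)

print(count_alphas)
print(any_alphas)
```

In this toy model the single count-2 pixel takes the top of the alpha range and pushes every count-1 pixel down to the floor, while the any-style aggregate has no spread and every covered pixel comes out equally strong.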

James A. Bednar