Masking an area of interest (AOI) on a raster image with rasterio

I use rasterio to mask an area of interest and set everything outside it to nodata.

My goal is to keep the original raster bounds and select the area of interest, but reduce the file size.

I followed the excellent "Masking a raster using a shapefile" topic from the rasterio website.

I changed the following lines from the original procedure

with rasterio.open("tests/data/RGB.byte.tif") as src:
    out_image, out_transform = rasterio.mask.mask(src, shapes, crop=True)
    out_meta = src.meta

to

with rasterio.open("tests/data/RGB.byte.tif") as src:
    out_image, out_transform = rasterio.mask.mask(src, shapes, filled=True, nodata=0)
    out_meta = src.meta

The result is exactly what I want:

[Image: result]

But the new raster file is the same size as before! I expected the file size to drop by half or more.

So I tried the following:

gdal_translate -of GTiff -a_nodata 0 input.tif output.tif

or

gdal_translate -of GTIFF -scale -a_nodata 0 myVrt.vrt output.tif

Neither helps.

Toren
  • Do you want to resample the image to a lower resolution? That step would be independent of the masking, take a look here: https://github.com/mapbox/rasterio/blob/master/docs/topics/resampling.rst – Christoph Rieke Jan 16 '19 at 17:59
  • https://stackoverflow.com/users/6400555/christoph-rieke Thanks for the comment. 1) I don't want to resample. 2) I'd like to keep the image bounds. 3) I'd like to replace zero values (black color) with "NoData" to dramatically reduce the stored raster file size – Toren Jan 16 '19 at 19:12
  • Could you provide a reproducible example? Would make things easier – Val Jan 17 '19 at 08:56
  • A quick hack would be to enable compression through the creation options (look at the GeoTiff creation options on the gdal GeoTiff format page). The large blocks of zeros will be heavily compressed and reduce the size dramatically. – Benjamin Mar 14 '19 at 01:23

1 Answer

The image is large because you're not using any compression on the output. Every pixel is stored in full, so e.g. a 2048x2048 RGB file will take up ~4M pixels times 3 bytes per pixel = 12 MB, no matter how many of those pixels are zero. Basic TIFF images are not compressed; you have to enable compression as an option when you convert.
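As a rough illustration of the arithmetic above, here is a minimal Python sketch using only the standard library. It assumes an 8-bit, 3-band 2048x2048 image with half of its pixels masked to zero, and uses `zlib` (the DEFLATE algorithm) as a stand-in for a lossless TIFF compression mode; the exact sizes a real GeoTIFF writer produces will differ.

```python
import zlib

# Assumed image dimensions: 2048 x 2048 pixels, 3 bands, 1 byte per sample.
width, height, bands = 2048, 2048, 3

# Uncompressed size: every sample is stored, zero or not.
raw_bytes = width * height * bands  # 12,582,912 bytes, i.e. 12 MiB

# Stand-in for the masked-out half of the image: a buffer of zero bytes.
masked = bytes(raw_bytes // 2)

# DEFLATE collapses the long run of zeros to a tiny fraction of its size.
packed = zlib.compress(masked)
ratio = len(packed) / len(masked)

print(f"uncompressed image: {raw_bytes} bytes")
print(f"masked half: {len(masked)} bytes -> {len(packed)} bytes compressed")
```

Setting a nodata value only changes how the zeros are interpreted; it is the compression step that makes them cheap to store.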

In most cases, JPEG is the simplest compression method to use (note that it is lossy):

gdal_translate -of GTIFF \
               -scale \
               -a_nodata 0 \
               -co COMPRESS=JPEG \
               -co PHOTOMETRIC=YCBCR myVrt.vrt output.tif

If you have multispectral images, or more than 3 bands (or 16-bit data), then this won't work, but you can use other compression modes like DEFLATE, LZW or ZSTD. A quick workaround if you have more than 3 channels is to select bands using the -b flag, which you can use multiple times:

-b 1 -b 2 -b 3

Have a read of this for more information.

Josh
  • This has nothing to do with compression and everything to do with intelligently setting raster bounds in `rasterio`. – user32882 Dec 29 '22 at 10:38
  • I disagree. OP is storing a raw image and says "I want to store zero values with nodata to reduce file size" and says that the original bounds must be preserved. So we have to store the full raster and apply compression (to save space from all the empty pixels), even if a mask is provided as a nodata, sidecar or an alpha channel. It might be cleaner to store a cropped image and then embed it into an image with the original bounds on load, but that's not what was asked. – Josh Jan 05 '23 at 20:38