0

I am working with high res NOAA aerial imagery data from Hurrican Ida, RGB. Data is available here under download - any of the date times will download a large amount of tifs https://storms.ngs.noaa.gov/storms/ida/index.html#18/29.46140/-90.30946

I want to merge multiple tifs along (x , y , bands) in order to create a complete 'mosaic', to identify any missing patches of imagery, and then create a building outline mask.

I have a function that can merge 20 Tifs, but above that it kills the kernel. Is there a better way to merge further? Ideally would want 40 in an image.

import rioxarray
from rioxarray import merge
from rasterio.plot import show 
 
def combine_tif_large(file_list, title=''):
        """Plot the combined image of a set of tif files - chunks long file list into groups of 50 (n) to prevent datasets closing
        file_list = list of file paths for files to be merged
        title = title for plot 
        """
        n = 50
        fig, ax = plt.subplots( figsize=(20,20))
        chunked_files = [file_list[i:i + n] for i in range(0, len(file_list), n)]
    
        final = [] 
        for chunk in chunked_files:
            now = datetime.now()
            current_time = now.strftime("%H:%M:%S")
            print("Current Time =", current_time)
            print('starting chunk..')
            elements = []
            for val in chunk:
                elements.append(rioxarray.open_rasterio(val))
            print('finish chunk')
            print('starting  merge')
            merged = merge.merge_arrays(elements, nodata=0.0)
            final.append(merged)
        print('starting final merge')
        merge_final = merge.merge_arrays(final, nodata=0.0)
        image = merge_final.values
        show(image, ax = ax, title= title) 
Grace
  • 67
  • 1
  • 8
  • 1
    If your kernel is being killed because the data is too large to fit into memory, it's probably too large to generate a single image from! You'll want to downsample the data somehow. the simplest approach would be to literally sample every Nth pixel... something like `da = rioxarray.open_rasterio(val); da = da[::100, ::100]`. You could use dask or all kinds of tricks to work with a larger-than-memory dataset but if you want to visualize the whole thing you're going to have to sample from or summarize it somehow. – Michael Delgado May 05 '22 at 01:43
  • 1
    thanks that makes a lot of sense! I guess I was thinking that I had to keep the resolution for the network, but thats not needed for the plot – Grace May 06 '22 at 14:07

0 Answers0