Is there a fast way to return a specific pixel range from a given channel of an OME-TIFF file using pyvips / libvips? crop doesn't let you specify a channel.

My OME-Tiff is large (10 GB+) so I don't want to load the entire image into memory.

Open to any suggestions and/or other workflows.

  • You could lazy access the data as a [zarr](https://pypi.org/project/zarr/) array or group via [tifffile](https://pypi.org/project/tifffile/). E.g. `zarr.open(tifffile.imread('multi-channel-z-series.ome.tif', aszarr=True), mode='r')[1, 1:4, 32:64, 32:64]`. – cgohlke Oct 22 '20 at 17:58

1 Answer

pyvips supports multipage documents as "toilet-roll" images (sorry). You set n=-1 to load all the pages, and they appear as a very tall, thin image, with the pages stacked vertically. The metadata item page-height gives the height in pixels of each sheet.

Docs here:

https://libvips.github.io/libvips/API/current/VipsForeignSave.html#vips-tiffload

For example:

$ vipsheader -a multi-channel-z-series.ome.tif 
multi-channel-z-series.ome.tif: 439x167 char, 1 band, b-w, tiffload
width: 439
height: 167
bands: 1
format: char
coding: none
interpretation: b-w
xoffset: 0
yoffset: 0
xres: 0
yres: 0
filename: multi-channel-z-series.ome.tif
vips-loader: tiffload
n-pages: 15
image-description: <?xml version="1.0" encoding="UTF-8"?><!-- Warning: this comment is an OME-XML metadata block, which contains crucial dimensional parameters and other important metadata. Please edit cautiously (if at all), and back up the original data before doing so...
resolution-unit: cm
orientation: 1

You can see this is a 15 page OME image. pyvips will load page 0 by default, and each page is 439 by 167 pixels. You can fetch the XML in image-description to see the full OME channel metadata.

$ vipsheader -f image-description multi-channel-z-series.ome.tif
<?xml version="1.0" encoding="UTF-8"?>
<!--- ... etc.

In Python you can do:

$ python3
Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvips
>>> x = pyvips.Image.new_from_file("multi-channel-z-series.ome.tif", n=-1)
>>> x.width
439
>>> x.height
2505
>>> x.get("page-height")
167
>>> x.height / x.get("page-height")
15.0

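So you can use crop to fetch a rect from a channel in the obvious way: page i starts at row i * page-height of the tall image, so add that offset to the top coordinate of the rect you want.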
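As a rough sketch of that offset arithmetic (the helper names `page_rect` and `crop_from_page` are made up for illustration and are not part of pyvips), assuming one image plane per TIFF page and that you already know which page holds the channel you want:

```python
def page_rect(page, page_height, left, top, width, height):
    """Translate a rect inside one page into toilet-roll coordinates."""
    if page < 0:
        raise IndexError("page must be non-negative")
    if top < 0 or height <= 0 or top + height > page_height:
        raise ValueError("rect must lie within a single page")
    # Pages are stacked vertically, so only the top coordinate moves.
    return left, page * page_height + top, width, height


def crop_from_page(filename, page, left, top, width, height):
    """Crop a rect from one page of a multipage TIFF (hypothetical helper)."""
    import pyvips  # imported here so page_rect can be used without pyvips

    # n=-1 loads every page as one tall image; pixels are read on demand.
    image = pyvips.Image.new_from_file(filename, n=-1)
    page_height = image.get("page-height")
    if page >= image.height // page_height:
        raise IndexError("page out of range")
    return image.crop(*page_rect(page, page_height, left, top, width, height))
```

For example, `crop_from_page("multi-channel-z-series.ome.tif", 3, 32, 32, 64, 64)` would return a 64x64 image taken from page 3. Which page corresponds to which channel depends on the dimension order recorded in the OME-XML, so check image-description first.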

Are you planning to generate patches for ML training? If you are, fetch (a method on pyvips.Region) can be much faster than crop for small patches. This issue has sample code and some benchmarks: in that example, crop takes 41s to make 12,000 32x32 patches, but fetch takes just 0.5s.
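A minimal sketch of the fetch approach, not the code from the benchmark: `patch_origins` and `fetch_patches` are hypothetical helpers, the non-overlapping grid layout is an assumption, and it relies on a pyvips version that provides `Region.fetch`.

```python
def patch_origins(width, height, size):
    """Top-left corners of non-overlapping size x size patches, row by row."""
    return [(x, y)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]


def fetch_patches(filename, page=0, size=32):
    """Fetch every full patch from one page as raw pixel buffers."""
    import pyvips  # imported here so patch_origins can be used without pyvips

    image = pyvips.Image.new_from_file(filename, page=page)
    # A region reads pixels straight from the image, without building
    # a new pipeline per patch as crop does.
    region = pyvips.Region.new(image)
    return [region.fetch(x, y, size, size)
            for x, y in patch_origins(image.width, image.height, size)]
```

Each item returned by `fetch_patches` is a buffer of raw pixel data, which you could wrap in a NumPy array with the dtype and band count matching the image.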

jcupitt
  • Thanks John, Appreciate your ownership of the library. Quick question -- if i'm just trying to load the nth channel, can i just give `n=10` to load the 10th channel, or is it quicker to do `n=-1` and do the vertical offset math. – Simon Warchol Oct 22 '20 at 18:21
  • Nvm -- i see that n is the number of pages. – Simon Warchol Oct 22 '20 at 19:09
  • Use `page=10` to pull out just page 10 (pages are numbered from 0), and use `n=` to set the number of pages to fetch. Tiles are decompressed on demand, so it won't make much difference to runtime. – jcupitt Oct 23 '20 at 08:49