There is a 4 GB .TIF image that needs to be processed. Because of a memory constraint, I can't load the whole image into a numpy array, so I need to load it lazily, in parts, from the hard disk. This has to be done in Python, as a project requirement. I also tried looking at the tifffile library on PyPI, but I found nothing useful there. Please help.
- Consider using `pyvips` - it is very frugal on memory. Maybe add the `vips` tag to attract the right people. Also try and be more explicit about your processing. – Mark Setchell Dec 04 '19 at 19:55
2 Answers
pyvips can do this. For example:
```python
import sys

import numpy as np
import pyvips

image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")

for y in range(0, image.height, 100):
    area_height = min(image.height - y, 100)
    area = image.crop(0, y, image.width, area_height)
    array = np.ndarray(buffer=area.write_to_memory(),
                       dtype=np.uint8,
                       shape=[area.height, area.width, area.bands])
    # process `array` here
```
The `access` option to `new_from_file` turns on sequential mode: pyvips will only load pixels from the file on demand, with the restriction that you must read pixels out top to bottom.
The loop runs down the image in blocks of 100 scanlines. You can tune this, of course.
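Since the question doesn't say what the processing is, here is a minimal sketch of what per-strip work could look like, assuming an 8-bit RGB image as above; the per-channel mean here is just a placeholder for your real processing:

```python
import sys

import numpy as np
import pyvips

image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")

# Accumulate per-channel sums strip by strip, so only one 100-scanline
# strip is ever held in memory at a time.
channel_sums = np.zeros(image.bands, dtype=np.float64)

for y in range(0, image.height, 100):
    area_height = min(image.height - y, 100)
    area = image.crop(0, y, image.width, area_height)
    tile = np.ndarray(buffer=area.write_to_memory(),
                      dtype=np.uint8,
                      shape=[area.height, area.width, area.bands])
    channel_sums += tile.sum(axis=(0, 1))

# Mean brightness per channel over the whole image.
print(channel_sums / (image.width * image.height))
```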
I can run it like this:
```
$ vipsheader eso1242a-pyr.tif
eso1242a-pyr.tif: 108199x81503 uchar, 3 bands, srgb, tiffload_stream
$ /usr/bin/time -f %M:%e ./sections.py ~/pics/eso1242a-pyr.tif
273388:479.50
```
So on this sad old laptop it took 8 minutes to scan a 108,000 x 82,000 pixel image and needed a peak of 270 MB of memory.
What processing are you doing? You might be able to do the whole thing in pyvips. It's quite a bit quicker than numpy.
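For example, a simple point operation can stay entirely inside pyvips, which then streams the image from input file to output file. This is only a sketch: the brighten-by-20% step stands in for your real processing, and `huge.tif` is a placeholder filename:

```python
import pyvips

# Sequential mode streams the image in strips, so a multi-gigabyte TIFF
# is never fully resident in memory.
image = pyvips.Image.new_from_file("huge.tif", access="sequential")

# linear(a, b) computes a * pixel + b; cast back to 8-bit for writing.
out = image.linear(1.2, 0).cast("uchar")
out.write_to_file("huge-brightened.tif")
```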

— jcupitt
```python
import pyvips

img = pyvips.Image.new_from_file("space.tif", access='sequential')
out = img.resize(0.01, kernel="linear")
out.write_to_file("resized_image.jpg")
```
If you want to convert the file to another format with a smaller size, this code will be enough, and it will do it without any memory spike and in very little time.
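If a small copy is all that's needed, pyvips also has a `thumbnail` operation that combines loading and shrinking; a sketch, with the 1000-pixel target width and the filenames as placeholders:

```python
import pyvips

# thumbnail() opens and shrinks in one step, so it can use shrink-on-load
# tricks (e.g. reading a lower pyramid level) where the format allows.
thumb = pyvips.Image.thumbnail("space.tif", 1000)
thumb.write_to_file("space_thumb.jpg")
```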

— Sudhanshu Shivam
- I downloaded a 1.6 GB file and it works without any spike in memory usage. – Sudhanshu Shivam Dec 09 '19 at 11:40
- Also, don't load it into numpy before the conversion; that will be better. – Sudhanshu Shivam Dec 09 '19 at 11:41