Flat-field correction on hyperspectral data

Question

I am working on a hyperspectral data set using the Spectral Python library. I started using Python for the first time on Monday, so everything is taking me a long time.

My data is in ENVI format, and I believe I have successfully read it in and converted it to NumPy arrays.

I am attempting a flat-field correction using the code below, which raises the ValueError shown after it:

corrected_nparr = np.divide(np.subtract(data_nparr, dark_nparr), np.subtract(white_nparr, dark_nparr))

ValueError: operands could not be broadcast together with shapes (1367,384,288) (100,384,288)

This doesn't work because my white reference and dark reference are a different size from the data capture.

print(white_nparr.shape)
(297, 384, 288)
print(dark_nparr.shape)
(100, 384, 288)
print(data_nparr.shape)
(1367, 384, 288)

So, I understand why I am getting the error: the original white and dark references were captured with different image sizes from the dataset. My problem is how to create a correction for the dataset when I only have access to references of different sizes.

Has anyone handled this before? What approach did you use?

By the way, the data I am using is mineral hyperspectral data captured from drill core. There is a huge dataset held by Geological Survey Ireland, and it is free upon request.

So, I received an extremely helpful answer, which actually sparked a further question.

# Take one horizontal line of spectra from each reference: a 2D array of shape
# (384, 288) that captures the variation across the line and broadcasts against the data
white_nparr_horiz = white_nparr[-2] 
dark_nparr_horiz = dark_nparr[-2] 
corrected_nparr = np.divide(np.subtract(data_nparr, dark_nparr_horiz), np.subtract(white_nparr_horiz, dark_nparr_horiz)) 

white_nparr_horiz.shape
Out[28]: (384, 288)
dark_nparr_horiz.shape
Out[29]: (384, 288)

So the shapes of these arrays are broadcastable across data_nparr. I have tested a few different indices with the code below, and it works as I expect.

a = white_nparr_horiz[150, 144]
b = dark_nparr_horiz[150, 144]
c = data_nparr[500, 150, 144]
d = (c - b)/(a-b)

test = d == corrected_nparr[500, 150, 144]

print(test)

The output from this looks much more as I would expect reflectance data for this material to look, so I believe I am on the right path.
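A spot check on a few indices can be extended to every element. The sketch below is only an illustration on small random stand-in arrays (the names data_demo, white_line and dark_line are hypothetical, not the arrays from the question). It compares the broadcast result with an explicit per-line loop and uses np.allclose rather than ==, since exact equality on floating-point values can be brittle.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins laid out as (lines, samples, bands).
data_demo = rng.uniform(100, 200, size=(6, 4, 3))
white_line = rng.uniform(220, 250, size=(4, 3))   # (samples, bands)
dark_line = rng.uniform(0, 20, size=(4, 3))       # (samples, bands)

# Broadcast correction: the 2D references are applied to every line.
corrected_demo = (data_demo - dark_line) / (white_line - dark_line)

# Recompute line by line so each step only combines arrays of equal shape.
expected = np.empty_like(data_demo)
for i in range(data_demo.shape[0]):
    expected[i] = (data_demo[i] - dark_line) / (white_line - dark_line)

# Compare all elements with a floating-point tolerance.
print(np.allclose(corrected_demo, expected))
```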

What I would like to do now is have white_nparr_horiz be the mean of each band along the original first axis of the white reference (297, 384, 288), returned as an array of shape (384, 288), rather than values taken from a single line, as I believe it is now. I am sure that this is possible, but I cannot figure out how.

As I said above, I am very new to Python, NumPy and image analysis, so apologies if this is obvious or I am going in the wrong direction.

It will be easier to help you if you also provide the shapes of the three arrays, as well as the error message. — bogatron, Sep 07 '22 at 22:01

bogatron · Accepted Answer · 2022-09-12T11:54:49.047


The problem is that your white and dark references should each be a single spectrum (1D array with 288 values), whereas yours are both 3-dimensional arrays (likely corresponding to image regions). To convert them to 1D, you can compute the mean, max, or min of each array, as appropriate. For example, to take the min of the dark reference and max of the white reference, you could convert them as follows:

dark_nparr = np.min(dark_nparr.reshape(-1, dark_nparr.shape[-1]), axis=0)
white_nparr = np.max(white_nparr.reshape(-1, white_nparr.shape[-1]), axis=0)

The lines above reshape each array to 2 dimensions (pixels by bands) and compute the min (or max) over the pixel axis, leaving one value per band (288 values).
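To make the reshape step concrete, here is a minimal sketch on a small stand-in array (dark_demo is a hypothetical name, and its shape (4, 5, 3) is chosen only to keep the output readable). It shows the intermediate 2D shape and checks that reducing over both spatial axes directly gives the same spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a reference cube: (lines, samples, bands).
dark_demo = rng.integers(0, 50, size=(4, 5, 3))

# Collapse the two spatial axes into one, keeping the band axis last.
flat = dark_demo.reshape(-1, dark_demo.shape[-1])
print(flat.shape)  # (20, 3)

# Reduce over all pixels, leaving one value per band.
dark_spectrum = flat.min(axis=0)
print(dark_spectrum.shape)  # (3,)

# The same result without reshaping: reduce over both spatial axes at once.
assert np.array_equal(dark_spectrum, dark_demo.min(axis=(0, 1)))
```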

If you prefer to use the spectral mean of each array instead, just replace np.max and np.min above with np.mean.
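As a rough sketch of the mean variant, again on a small hypothetical stand-in (white_demo) and assuming the raw data is stored as unsigned 16-bit integers, which may not match your files: np.mean returns a floating-point array for integer input, so the resulting spectrum is already suitable for the later subtraction and division.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-in for the white reference: (lines, samples, bands).
white_demo = rng.integers(3500, 4000, size=(7, 4, 3), dtype=np.uint16)

# One mean value per band, averaged over every pixel of the reference.
white_spectrum = np.mean(white_demo.reshape(-1, white_demo.shape[-1]), axis=0)

print(white_spectrum.shape)  # (3,)
print(white_spectrum.dtype)  # float64
```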

If you want each array to be reduced only over its first dimension (i.e., to have shape (384, 288)), then just don't reshape the arrays when doing the reduction. As before, replace np.min and np.max with np.mean to average instead.

dark_nparr = np.min(dark_nparr, axis=0)
white_nparr = np.max(white_nparr, axis=0)
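Putting the pieces together, the following is a sketch of a full correction using the mean over the first axis. The array names and small shapes are hypothetical stand-ins, the uint16 dtype is an assumption about the raw data, and the zero-denominator guard is an extra precaution that is not part of the answer above. Converting to float before subtracting matters because unsigned integer subtraction in NumPy wraps around instead of going negative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-ins with the question's layout: (lines, samples, bands).
data_demo = rng.integers(200, 3000, size=(10, 4, 3), dtype=np.uint16)
white_demo = rng.integers(3500, 4000, size=(7, 4, 3), dtype=np.uint16)
dark_demo = rng.integers(100, 200, size=(5, 4, 3), dtype=np.uint16)

# Average each reference over its first axis: result is (samples, bands).
white_mean = white_demo.mean(axis=0)
dark_mean = dark_demo.mean(axis=0)

# Work in floating point so the subtraction cannot wrap around.
numerator = data_demo.astype(np.float64) - dark_mean
denominator = white_mean - dark_mean

# Divide only where the denominator is non-zero; other entries stay 0.
reflectance = np.divide(numerator, denominator,
                        out=np.zeros_like(numerator),
                        where=denominator != 0)

print(reflectance.shape)  # (10, 4, 3)
```

The references have different numbers of lines (7 and 5 here, 297 and 100 in the question), which is not a problem because each is reduced to (samples, bands) before it meets the data.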

edited Sep 12 '22 at 11:54

answered Sep 09 '22 at 12:36

bogatron


Thanks for your answer! If I understand it, this will produce a white and dark ref that is a spectrum of the wavelengths captured for a single pixel? And then I can use it to populate an array of the same shape as data_nparr and perform the correction? – russj Sep 09 '22 at 16:23
There is no need to populate larger arrays. numpy [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) will take care of that when doing the subtraction. – bogatron Sep 09 '22 at 21:10
Okay, so I have been playing with this, and your answer does not get the desired result. Not your fault, as you cannot be expected to read my mind, but I have come up with this:
~~~
# created these files to broadcast as they are a horizontal line of spectra, a 2D array which captures the variation
white_nparr_horiz = white_nparr[-2]
dark_nparr_horiz = dark_nparr[-2]
corrected_nparr = np.divide(np.subtract(data_nparr, dark_nparr_horiz), np.subtract(white_nparr_horiz, dark_nparr_horiz))
white_nparr_horiz.shape
Out[28]: (384, 288)
dark_nparr_horiz.shape
Out[29]: (384, 288)
~~~
– russj Sep 11 '22 at 18:50
The two "_horiz" arrays at the end of your comment still need to be converted to 1D by min, max, or mean (or some other reduction operation) along the first axis. – bogatron Sep 11 '22 at 21:54
They do seem to broadcast correctly to the data array as 2D arrays using the np.subtract and np.divide functions. I have edited my original question to give more details. – russj Sep 12 '22 at 07:52
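The broadcasting behaviour discussed in these comments can be checked from the shapes alone. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available. It shows that both a 1D spectrum of shape (288,) and a 2D line of shape (384, 288) broadcast against the data cube, while the full dark cube does not.

```python
import numpy as np

# Shapes from the question. broadcast_shapes works on shapes alone,
# so no large arrays are allocated here.
data_shape = (1367, 384, 288)

# A 1D spectrum: one value per band.
print(np.broadcast_shapes(data_shape, (288,)))

# A 2D line: one value per sample and band.
print(np.broadcast_shapes(data_shape, (384, 288)))

# The full dark cube has 100 lines against 1367, so broadcasting fails.
try:
    np.broadcast_shapes(data_shape, (100, 384, 288))
    cube_broadcasts = True
except ValueError as err:
    cube_broadcasts = False
    print("not broadcastable:", err)
```

Broadcasting aligns shapes from the last axis backwards, and each pair of dimensions must either match or contain a 1. That is why the trailing (384, 288) and (288,) shapes work while 100 against 1367 does not.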
