1

I am new to using GDAL in Python and I am trying to use it to retrieve the band data from Sentinel 2 SAFE-products. I managed to extract the band array, but I couldn't manage to get it scaled correctly.

This extracts the unscaled array of Band 4:

import gdal

product_path = "S2B_MSIL2A_20200124T101219_N0213_R022_T33UUU_20200124T121752.SAFE"
dataset = gdal.Open(product_path + "MTD_MSIL2A.xml")
bands10m_path = dataset.GetSubDatasets()[0][0]
bands10m_dataset = gdal.Open(bands10m_path)
b4_band = bands10m_dataset.GetRasterBand(1)
b4_array = b4_band.ReadArray()

So far so good, but the data type of the array is uint16 and the values range from 0 to 16896.

b4_band.GetMinimum() and b4_band.GetMaximum() both return None.

And b4_band.GetStatistics(True,True) returns [0.0, 2829.0, 347.05880000000104, 334.8397839901348] (as min, max, mean, stddev).

Does this help me somehow to extract the correct scale? I am clueless...

jakobxu
  • 15
  • 5
  • `b4_array` should be a `numpy` array, in which case you can use the standard `.min()` and `.max()` methods – Val Feb 05 '20 at 16:54
  • @Val sure, but I am looking for the minimum and maximum of the reflectance in order to translate the ```b4_array``` into reflactance values. – jakobxu Feb 05 '20 at 18:02
  • Well, you can convert the array by using its min/Max values and write it back to the geotiff. That aside, it being a Sentinel-2 scene, the values should be be DN which are converted to reflectances using a fixed scale factor, not min/max. – Val Feb 05 '20 at 20:51
  • @Val thanks for the answer. Do you know where to find such a scale factor? calling ```b4_band.GetScale()``` would just return ```1.0```. – jakobxu Feb 05 '20 at 22:09
  • Google is generally a good place ;) - what you are looking for is the `QUANTIFICATION VALUE`. S2 L2A products are already BOA reflectances, just multiplied by that value for storage reasons. It used to be 10000, I don't think it has changed. – Val Feb 06 '20 at 08:22
  • @Val - I double checked the quantification value and it is still 10000 and the values make sense now. Thanks so much for your help, you saved me a lot of time. – jakobxu Feb 06 '20 at 09:45

1 Answers1

2

It's good to be aware that even if the scale/offset are specified in the file, GDAL won't automatically apply them.

In the case of Sentinel 2 they are not specified in the file, but in the metadata (xml). You can look in the xml-file you are using in your example with a text editor. And search for "QUANTIFICATION_VALUE" as @Val suggested.

It can also be retrieved from the metadata as parsed by GDAL. This can be done with dataset.GetMetadata() which will return a dictonary. You can also call the gdal.Info utilty, both methods are shown below.

import gdal

archive = 'S2A_MSIL2A_20200126T082221_N0213_R121_T34HDH_20200126T110103.zip'

# Use a dataset
ds = gdal.Open(archive)
meta = ds.GetMetadata()
ds = None

# Alternatively use gdal.Info
r = gdal.Info(archive, format='json')
meta = r['metadata']['']

You can filter out the relevant values, and convert them from string to float with something like:

{k: float(v) for k,v in meta.items() if k.endswith('_QUANTIFICATION_VALUE')}

# Result in:
{'AOT_QUANTIFICATION_VALUE': 1000.0,
 'BOA_QUANTIFICATION_VALUE': 10000.0,
 'WVP_QUANTIFICATION_VALUE': 1000.0}
Rutger Kassies
  • 61,630
  • 17
  • 112
  • 97