I'm beginning to play with GeoPySpark and am implementing an example notebook.
I successfully retrieved the images with:
!curl -o /tmp/B01.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B01.jp2
!curl -o /tmp/B09.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B09.jp2
!curl -o /tmp/B10.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B10.jp2
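As a sanity check on the downloads, a minimal sketch like the one below could confirm that each file really is JPEG 2000 data rather than, say, an error page saved under a `.jp2` name. It only inspects the leading bytes with the standard library, so it does not depend on Rasterio or GDAL. The helper name `looks_like_jpeg2000` is made up for this illustration. The two byte patterns are the JP2 container signature box and the raw codestream start markers.

```python
from pathlib import Path

# 12-byte signature box that opens a JP2 container file.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")
# SOC + SIZ markers that open a raw JPEG 2000 codestream.
J2K_CODESTREAM = bytes.fromhex("ff4fff51")


def looks_like_jpeg2000(path):
    """Hypothetical helper: True if the file starts with a JPEG 2000 signature."""
    with open(path, "rb") as f:
        head = f.read(12)
    return head.startswith(JP2_SIGNATURE) or head.startswith(J2K_CODESTREAM)


if __name__ == "__main__":
    # Paths taken from the curl commands above.
    for name in ["/tmp/B01.jp2", "/tmp/B09.jp2", "/tmp/B10.jp2"]:
        if not Path(name).exists():
            print(name, "is missing")
        elif looks_like_jpeg2000(name):
            print(name, "starts with a JPEG 2000 signature")
        else:
            print(name, "does NOT look like JPEG 2000")
```

If all three files pass, the download is probably fine and the problem more likely lies in how Rasterio was built or installed.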
Here is the script:
import rasterio
import geopyspark as gps
import numpy as np
from pyspark import SparkContext
conf = gps.geopyspark_conf(master="local[*]", appName="sentinel-ingest-example")
pysc = SparkContext(conf=conf)
jp2s = ["/tmp/B01.jp2", "/tmp/B09.jp2", "/tmp/B10.jp2"]
arrs = []
for jp2 in jp2s:
    with rasterio.open(jp2) as f:  # CRASHES HERE
        arrs.append(f.read(1))
data = np.array(arrs, dtype=arrs[0].dtype)
data
The script crashes at the line marked above, with the following error:
RasterioIOError: '/tmp/B01.jp2' not recognized as a supported file format.
I copy-pasted the example code exactly, and the Rasterio docs even use .jp2 files in their examples.
I'm using the following version of Rasterio, installed with pip3. I do not have Anaconda installed (it messes up my Python environments), and I do not have GDAL installed (it refuses to install; that would be the topic of another question if installing it is my only solution).
Name: rasterio
Version: 1.1.0
Summary: Fast and direct raster I/O for use with Numpy and SciPy
Home-page: https://github.com/mapbox/rasterio
Author: Sean Gillies
Author-email: sean@mapbox.com
License: BSD
Location: /usr/local/lib/python3.6/dist-packages
Requires: click-plugins, snuggs, numpy, click, attrs, cligj, affine
Required-by:
Why does it refuse to read .jp2 files? Is there a way to convert them to something usable? Or do you know of any similar example files in a supported format?