
I want to download a netCDF4 file from a webpage. The download itself succeeds, but the file I get with the following code seems to be corrupted:

import shutil

import requests
from netCDF4 import Dataset


def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            shutil.copyfileobj(r.raw, f)

    return local_filename


url = 'https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc'
local_filename = download_file(url)
sm_nc = Dataset(local_filename)

But in the end I get this error message:

Traceback (most recent call last):

  File "<ipython-input-98-809c92d8bce8>", line 1, in <module>
    sm_nc = Dataset(local_filename)

  File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__

  File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success

OSError: [Errno -51] NetCDF: Unknown file format: b'SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc'

I also simply tried urllib.request.urlretrieve(url, './1.nc'), then sm_nc = Dataset('./1.nc'), but just got the following error message:

Traceback (most recent call last):

  File "<ipython-input-101-61d1f577421e>", line 1, in <module>
    sm_nc = Dataset('./1.nc')

  File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__

  File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success

OSError: [Errno -51] NetCDF: Unknown file format: b'./1.nc'

However, if I paste the URL into the address bar of Safari or Chrome and download the file from there, the file I get is readable by netCDF4.Dataset. (You could try that too.) I tried many other solutions, but none of them worked. Could anybody help me with this? Thanks! By the way, I am using requests 2.26.0 and netCDF4 1.5.3; urllib.request is the one from Python 3.7.

Xu Shan

2 Answers


You probably want to use urlretrieve. The following call to urllib should work:

import urllib.request
new_x = "/tmp/temp.nc"
x = "https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc"
urllib.request.urlretrieve(x, new_x)
Robert Wilson
  • still does not work, ```Dataset("/tmp/temp.nc")```, then I got error message: ```Traceback (most recent call last): File "", line 1, in Dataset('/tmp/temp.nc') File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__ File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success OSError: [Errno -51] NetCDF: Unknown file format: b'/tmp/temp.nc'``` – Xu Shan Feb 24 '22 at 15:46
  • oh, in my original problem description, I have used urllib.request.urlretrieve... – Xu Shan Feb 24 '22 at 15:47
  • I get that message if I try downloading the file using wget. It's more likely this is a data problem – Robert Wilson Feb 24 '22 at 17:00
  • no, if you pasted it in the search box of the browser, then download it manually, the file is ok. – Xu Shan Feb 24 '22 at 18:34
  • just found it out, the reason is that I need to log in...but still thanks! – Xu Shan Apr 11 '22 at 09:02

When I fetch it with wget I do get an .nc file, but I am not sure about it: its size is only 19 KB. You can call wget from Python if this file is okay for you.

wget https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc

But that file is not readable: if you try to access the URL without logging in to the site, the server returns a meaningless file. If you paste the link into your browser and log in, you get a 6 MB file, which I am sure is readable. If you still want to fetch the file from a Python script, take a look at Selenium, which can click through the website for you, so that your script can log in and then download the file.
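One way to confirm that the small file is not netCDF at all is to look at its first bytes. The sketch below is only an illustration: `sniff_download` is a hypothetical helper, not part of netCDF4 or requests, and it assumes the format signature sits at the very start of the file, which is the usual case for netCDF-4/HDF5 files but not guaranteed.

```python
# Leading bytes ("magic numbers") of the on-disk formats netCDF uses.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # netCDF-4 files are HDF5 files
CLASSIC_MAGICS = (b"CDF\x01", b"CDF\x02", b"CDF\x05")  # classic, 64-bit offset, CDF-5


def sniff_download(path):
    """Return a rough guess at what a downloaded file contains.

    Hypothetical helper for illustration; it only inspects the first
    512 bytes and assumes the signature is at offset 0.
    """
    with open(path, "rb") as f:
        head = f.read(512)
    if head.startswith(HDF5_MAGIC):
        return "netcdf4-hdf5"
    if head.startswith(CLASSIC_MAGICS):
        return "netcdf-classic"
    # A login or error page typically starts with HTML/XML markup.
    if head.lstrip().lower().startswith((b"<!doctype", b"<html", b"<?xml")):
        return "markup"
    return "unknown"
```

Calling `sniff_download(local_filename)` on the file saved by the script would probably return "markup" if the server sent a login page, while the file downloaded through the logged-in browser should give "netcdf4-hdf5" or "netcdf-classic".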

updraftman
  • Hi, thanks for your reply! But this file is still not readable...you could try it with Python's "netCDF4.Dataset"...but if you download it by pasting this link into the browser, it can be read...as stated in my question description... – Xu Shan Mar 16 '22 at 15:07
  • I updated my answer please check it. – updraftman Mar 17 '22 at 17:32
  • Yes, I just found that I need to login...because when I tested "paste link to browser then download", my account is logged in at that time...in that case in my code I need to write a login for the downloading...thanks! – Xu Shan Mar 19 '22 at 10:15
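Since the root cause turned out to be a missing login, a download helper can at least fail loudly instead of silently saving a login page. The following is a minimal sketch, not the site's documented procedure: `download_checked` is a hypothetical name, `session` is assumed to be an already authenticated `requests.Session` (how to log in depends on the site and is not shown here), and the check assumes the server labels its login page with a `text/html` content type.

```python
def download_checked(session, url, local_filename, chunk_size=1 << 20):
    """Stream `url` to `local_filename`, refusing HTML responses.

    `session` is assumed to behave like an authenticated requests.Session.
    """
    with session.get(url, stream=True) as r:
        r.raise_for_status()  # turn 4xx/5xx responses into an exception
        content_type = r.headers.get("Content-Type", "")
        if "text/html" in content_type.lower():
            # Most likely a login or error page rather than the data file.
            raise RuntimeError(f"got an HTML page instead of data: {content_type!r}")
        with open(local_filename, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return local_filename
```

With a logged-in session this would be called as `download_checked(session, url, url.split('/')[-1])`; without a login it should raise rather than leave an unreadable .nc file on disk, provided the content-type assumption holds.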