I'm trying to download a file from inside a tar file on an FTP server, similar to "Read contents of .tar.gz file from website into a python 3.x object". When I open the tarfile, I get a ReadError (below).

import tarfile
import urllib

ftpURL = u'ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/wave/prod/multi_1.20170201/multi_1.t00z.spec_tar.gz'
ftpstream = urllib.urlopen(ftpURL)  # Python 2: file-like object that streams the FTP download
tar = tarfile.open(fileobj=ftpstream, mode='r|bz2')    # here's where I get the error
Traceback (most recent call last):
  File "C:\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-20-c3e97355618c>", line 1, in <module>
    tar = tarfile.open(fileobj=ftpstream, mode='r|bz2')
  File "C:\Anaconda2\lib\tarfile.py", line 1703, in open
    t = cls(name, filemode, stream, **kwargs)
  File "C:\Anaconda2\lib\tarfile.py", line 1587, in __init__
    self.firstmember = self.next()
  File "C:\Anaconda2\lib\tarfile.py", line 2355, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "C:\Anaconda2\lib\tarfile.py", line 1251, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "C:\Anaconda2\lib\tarfile.py", line 579, in read
    buf = self._read(size)
  File "C:\Anaconda2\lib\tarfile.py", line 598, in _read
    raise ReadError("invalid compressed data")
ReadError: invalid compressed data

Am I missing something with the buffer size? If so, since I'm not familiar with buffer sizes, where would I find information on the size that is needed? I've tried doubling and tripling the size, to no avail. I've also tried a few different files. I'm able to download the file manually and open it on my machine. Any help is much appreciated.

SBFRF

1 Answer

Look closer at the signature:

tarfile.open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

And the description:

If given, fileobj may be any object that has a read() or write() method (depending on the mode). bufsize specifies the blocksize and defaults to 20 * 512 bytes. Use this variant in combination with e.g. sys.stdin, a socket file object or a tape device. However, such a TarFile object is limited in that it does not allow random access, see Examples.

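What you meant to do was pass the stream as fileobj. Also note that the file is a .tar.gz, so the stream mode has to be gzip ('r|gz') rather than bzip2 ('r|bz2'); the wrong decompressor is what produces "invalid compressed data", not the buffer size:

tar = tarfile.open(fileobj=ftpstream, mode='r|gz')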
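The file name ends in .tar.gz, so the data is gzip-compressed. Here is a rough sketch of how the stream mode interacts with that. It is written for Python 3 and uses a small in-memory archive as a stand-in for the FTP download, so it runs without network access. The helpers make_tar_gz and read_member_names, and the member names a.spec and b.spec, are made up for illustration and are not part of tarfile or of the NOAA archive.

```python
import io
import tarfile


def make_tar_gz(members):
    """Build a small gzip-compressed tar archive in memory.

    Hypothetical stand-in for the bytes that the FTP server would send.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w:gz') as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member_names(stream, mode):
    """Open a file-like object in the given tarfile mode and list its members."""
    with tarfile.open(fileobj=stream, mode=mode) as tar:
        return [member.name for member in tar]


archive = make_tar_gz({'a.spec': b'hello', 'b.spec': b'world'})

# Wrong decompressor: gzip data opened as a bzip2 stream.
try:
    read_member_names(io.BytesIO(archive), 'r|bz2')
    bz2_error = None
except tarfile.ReadError as exc:
    bz2_error = exc

# Matching decompressor: gzip data opened as a gzip stream.
gz_names = read_member_names(io.BytesIO(archive), 'r|gz')

print('r|bz2 ->', repr(bz2_error))
print('r|gz  ->', gz_names)
```

With the real download, the object returned by urlopen would take the place of io.BytesIO(archive) (on Python 3 that would be urllib.request.urlopen rather than urllib.urlopen); that part is an assumption and is not exercised here. Stream modes ('r|...') only allow reading members in order, so each member should be processed as it is reached.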
TkTech
  • Thanks, I should have seen that. However my problem is still not solved. I've edited the post with new issues. – SBFRF Feb 01 '17 at 18:27
  • Changing your post each time your problem changes as you try to brute-force a fix is not how Stack Overflow works. – TkTech Feb 01 '17 at 20:32
  • Well, @TkTech, I'm sorry if my methodology is not what you would prefer. How would you suggest I find a solution to my problem? – SBFRF Feb 01 '17 at 20:41