The following code downloads a zip file and saves the file contained in the archive; it doesn't give any error message.

from io import BytesIO
import zipfile as zf
from urllib.request import urlopen

import pickle as pc  # file manager
resp = urlopen('ftp://ftp.ibge.gov.br/Precos_Indices_de_Precos_ao_Consumidor/IPCA/Serie_Historica/ipca_SerieHist.zip')
zipfile = zf.ZipFile(BytesIO(resp.read()))

zipped_filenames = zipfile.namelist()
for filename in zipped_filenames:
    print('Filename: ', filename)

    xls_file = zipfile.read(filename)
    with open(filename, 'wb') as output:
        pc.dump(xls_file, output, pc.HIGHEST_PROTOCOL)

Output:

Filename:  ipca_201807SerieHist.xls

When I try to open the file 'ipca_201807SerieHist.xls' (downloaded and extracted with the above code) in LibreOffice, it doesn't recognize the file format and tries to import it instead.

If I go to the URL 'ftp://ftp.ibge.gov.br/Precos_Indices_de_Precos_ao_Consumidor/IPCA/Serie_Historica/ipca_SerieHist.zip', save 'ipca_SerieHist.zip' to disk, and then extract and open 'ipca_201807SerieHist.xls', LibreOffice recognizes the file.

Both copies of 'ipca_201807SerieHist.xls' have similar sizes; the one produced by the code is slightly larger (62994 bytes vs. 62976 bytes). When I compare their contents, they are nearly identical except for a few isolated characters.

Note: The contents of 'ipca_201807SerieHist.xls' are in Portuguese.

user3889486
  • Why do you use `pickle.dump` and not `output.write`? – mkrieger1 Aug 27 '18 at 20:44
  • Possible duplicate of [Extract a specific file from a zip archive without maintaining directory structure in python](https://stackoverflow.com/questions/17729703/extract-a-specific-file-from-a-zip-archive-without-maintaining-directory-structu) – mkrieger1 Aug 27 '18 at 20:46

1 Answer

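As mkrieger1 pointed out, the problem was the call to `pickle.dump`. It serializes the bytes object in pickle's own format, which wraps the data in a few extra bytes, so the result is no longer a valid .xls file. Changing the very last line to write the raw bytes with `output.write` solved the issue: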

for filename in zipped_filenames:
    print('Filename: ', filename)

    xls_file = zipfile.read(filename)
    with open(filename, 'wb') as output:
        output.write(xls_file)
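
To illustrate the difference, here is a minimal sketch that needs no network access. It builds a small zip archive in memory, shows that pickling the extracted bytes makes them longer, and then lets `zipfile` write the member to disk itself with `extractall`. The member name 'example.xls' and its contents are placeholders I made up for the demonstration, not the real IBGE file.

```python
import pickle
import tempfile
import zipfile
from io import BytesIO
from pathlib import Path

# Build a small zip archive in memory so the example needs no network access.
# 'example.xls' and its contents are placeholders, not the real IBGE file.
payload = b'placeholder spreadsheet bytes' * 100
buffer = BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('example.xls', payload)
buffer.seek(0)

with zipfile.ZipFile(buffer) as archive:
    raw = archive.read('example.xls')

    # pickle wraps the bytes in its own serialization format,
    # so the pickled form is longer than the original data.
    pickled = pickle.dumps(raw, pickle.HIGHEST_PROTOCOL)
    overhead = len(pickled) - len(raw)
    print('pickle overhead in bytes:', overhead)

    # extractall writes each member to disk unchanged.
    with tempfile.TemporaryDirectory() as target_dir:
        archive.extractall(target_dir)
        extracted = Path(target_dir, 'example.xls').read_bytes()

print('extracted file matches original:', extracted == payload)
```

Using `extractall` (or `extract` for a single member) avoids the manual `open`/`write` loop entirely, which may be simpler when you just want the files on disk.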
user3889486