
I'm writing a Python script that creates a COVID-19 dashboard for my country and state and updates it daily.

However, I am struggling to download one of the necessary files.

To download the file, I have to open the website (https://covid.saude.gov.br/) and click a button (class="btn-white md button button-solid button-has-icon-only ion-activatable ion-focusable hydrated ion-activated").

I tried to download the file through its download link, but the site generates a different link every time the button is clicked, and the link is a blob URL (it has blob: in front of the http part).

I am very grateful to anyone who tries to help, because the data will be used to monitor the progress of the disease here where I live.

Neves

1 Answer


You can use their API to get the URL of the file:

import requests

# Request headers for the site's API, including the application ID.
headers = {
    'authority': 'xx9p7hp1p7.execute-api.us-east-1.amazonaws.com',
    'x-parse-application-id': 'unAFkcaNDeXajurGB7LChj8SgQYS2ptm',
}

with requests.Session() as session:
    session.headers.update(headers)
    resp = session.get('https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeral')
    resp.raise_for_status()  # fail early on an HTTP error
    # The URL of the data file is nested in the first result.
    path = resp.json()['results'][0]['arquivo']['url']

The x-parse-application-id doesn't seem to change. If it does, you can get the current one by querying https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeralApi and extracting it from ['planilha']['arquivo']['url'].
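Once you have the URL in path, saving the file could look roughly like the sketch below. The helper download_file is a hypothetical name, not part of the site's API or of requests. It assumes that session behaves like a requests.Session and that the URL can be fetched with a plain GET, which I have not verified for this site.

```python
import os
from urllib.parse import urlparse


def download_file(session, url, dest_dir="."):
    """Stream `url` into `dest_dir` and return the path of the saved file.

    `session` is assumed to behave like a requests.Session.
    """
    # Use the last path segment of the URL as the local file name.
    filename = os.path.basename(urlparse(url).path) or "download"
    dest = os.path.join(dest_dir, filename)

    # stream=True avoids loading the whole file into memory at once.
    with session.get(url, stream=True) as resp:
        resp.raise_for_status()
        with open(dest, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=8192):
                fh.write(chunk)
    return dest
```

Inside the with block above you would call download_file(session, path) and get back the local file name. If the server rejects the request, the call raises an exception instead of writing an error page to disk.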

Gregor