How can I scrape the data on this page with Python, specifically the numbers shown in the charts? I tried BeautifulSoup,
but after inspecting the HTML, the data doesn't seem to be inside any tag I can scrape.
The numbers shown in the charts don't appear in the response to my request, and I couldn't find them when inspecting the HTML either (see image below).
Input
from bs4 import BeautifulSoup
import requests
url = "https://viz.saude.gov.br/extensions/CobVac_MOV/CobVac_MOV.html"
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
print(soup.prettify())
Output
<!DOCTYPE html>
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="content-type"/>
<meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>
<title>
MS-SUS COVID-19 Distribuição de Vacinas
</title>
<meta charset="utf-8"/>
<meta content="True" name="HandheldFriendly"/>
<meta content="320" name="MobileOptimized"/>
<meta content="width=device-width, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0, user-scalable=no" name="viewport"/>
<meta content="yes" name="apple-mobile-web-app-capable"/>
<meta content="black" name="apple-mobile-web-app-status-bar-style"/>
<meta content="on" http-equiv="cleartype"/>
<!--Polymer stuff -->
<script src="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/webcomponentsjs/webcomponents-lite.min.js">
</script>
<script src="https://kit.fontawesome.com/a076d05399.js">
</script>
<link href="qliksense-card.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/iron-flex-layout/iron-flex-layout-classes.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-header-panel/paper-header-panel.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-toolbar/paper-toolbar.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-drawer-panel/paper-drawer-panel.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-icon-button/paper-icon-button.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/iron-icons/iron-icons.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/iron-pages/iron-pages.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-menu/paper-menu.html" rel="import"/>
<link href="https://cdn.rawgit.com/download/polymer-cdn/1.7.0.2/lib/paper-item/paper-item.html" rel="import"/>
<link href="polymer-mixins.html" rel="import"/>
<style include="iron-flex iron-positioning" is="custom-style">
</style>
<style include="polymer-mixins" is="custom-style">
</style>
<!-- Bootstrap css -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet"/>
<!-- Font Awesome -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet"/>
<!-- Qlik -->
<link href="../../resources/autogenerated/qlik-styles.css" rel="stylesheet"/>
<script src="../../resources/assets/external/requirejs/require.js">
</script>
<!-- Bootstrap js -->
<script crossorigin="anonymous" src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js">
</script>
<!-- google fonts -->
<link href="https://fonts.googleapis.com/css?family=Source+Sans+Pro" rel="stylesheet"/>
<!-- Project code -->
<link href="CobVac_MOV.css" rel="stylesheet"/>
<script src="CobVac_MOV.js">
</script>
<!-- fontawesome -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css" rel="stylesheet"/>
</head>
<body class="fullbleed vertical layout">
<paper-drawer-panel disable-edge-swipe="true" force-narrow="true" right-drawer="" z-index="1000">
<!-- FILTROS INI ============================================================ -->
<div drawer="">
<div class="drawer-title">
Filtros
</div>
<div class="filter-container">
<div class="qvobject" id="qvfilters">
</div>
</div>
</div>
<!-- FILTROS FIM ============================================================ -->
<!-- PAGINA INI ============================================================ -->
<paper-header-panel main="">
<!-- HEADER INI ============================================================ -->
<div class="paper-header">
<paper-toolbar style="background-color: #306BBC; color: #ffffff;">
<paper-icon-button class="visible-xs-block" icon="menu" id="nav-menu-button">
</paper-icon-button>
<img src="LOGO_TOPO.png" style="height:33px; width:161px;"/>
<div class="title" style="font-size:18px;">
<b>
COVID-19 Vacinação
<br/>
Distribuição de Vacinas
</b>
</div>
<!--TITLE-->
<paper-icon-button class="filter-drawer-toggle" icon="search" paper-drawer-toggle="">
</paper-icon-button>
<paper-icon-button class="filter-drawer-toggle" data-target="#basic" data-toggle="modal" icon="help">
</paper-icon-button>
</paper-toolbar>
<!-- BARRA DE FILTROS =================== more-vert -->
<div class="qvobjects" id="CurrentSelections" style="position:relative; top:0; left:0; width:100%; height:38px;">
</div>
</div>
<!-- HEADER FIM ============================================================ -->
<!-- PAGINA UTIL INI ============================================================ -->
<paper-drawer-panel drawer-width="0px" id="nav-drawer">
<!-- PAGINAS ## INI ============================================================ -->
<iron-pages main="" selected="0" style="background-color:#eee;">
<!-- Each .paper-body contained within <iron-pages> is a view. Copy and paste to add more views. -->
<!-- Don't forget to add a <paper-item> in the <paper-menu> above to be able to navigate to any view you add -->
<!-- ========================== -->
<!-- PAGINA 0 -->
<!-- ========================== -->
<div class="paper-body">
<div class="container-fluid">
<!-- A .qvplaceholder will become a droppable area in the dev-hub -->
<!-- Each .qvplaceholder must have a unique id -->
<!-- These .qvplaceholder objects below have an extra class, .kpi, which applies some simple styles intended for kpi objects -->
<!--
<div class="row">
<p style="color:red">
<b>IMPORTANTE: As informações mostradas neste painel referem-se apenas às doses enviadas a partir do Ministério da Saúde.</b>
</p>
</div>
-->
<!--<div class="row">
<b>DOSES ENVIADAS PELO MINISTÉRIO DA SAÚDE AOS ESTADOS</b>
</div>-->
<!-- =================== -->
<!-- KPIS -->
<!-- =================== -->
<!-- ====================================================== -->
<!-- Ver os icones em https://fontawesome.com/v4.7.0/icons/ -->
<!-- ====================================================== -->
<div class="row kpi-row">
<div class="col-xs-12 col-sm-12 col-lg-7">
<div class="kpi-side">
<i class="fas fa-syringe">
</i>
</div>
<div class="kpi corkpi01 qvobject" id="KPI-01">
</div>
</div>
<!--<div class="col-xs-12 col-sm-6 col-lg-3">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi02 qvobject" id="KPI-02"></div>
</div>
<div class="col-xs-12 col-sm-6 col-lg-4">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi03 qvobject" id="KPI-03"></div>
</div>
<div class="col-xs-12 col-sm-6 col-lg-8">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi04 qvobject" id="KPI-04"></div>
</div>-->
</div>
<div class="row">
<p>
<b>
<a href="https://sage.saude.gov.br/sistemas/vacina/documentosVacina.php">
Acesse aqui
</a>
os arquivos com os comprovantes de recebimento pelos Estados.
<br/>
</b>
<!--<br>
Esclarecimento: Doses em Trânsito são aquelas que estão sendo enviadas pelos Estados aos seus Municípios.
</p>-->
</p>
</div>
<!-- =================== -->
<!-- GRAFICOS 0.1 UF, MAPA -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-8">
<!-- Placing a .qvplaceholder within a <qliksense-card> will create a cardified object -->
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G01A">
</div>
</qliksense-card>
</div>
<div class="col-xs-12 col-sm-4">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G01B">
</div>
</qliksense-card>
</div>
</div>
<!-- =================== -->
<!-- GRAFICOS 0.2 VACINA, TEMPO -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-6">
<!-- Placing a .qvplaceholder within a <qliksense-card> will create a cardified object -->
<qliksense-card content-height="400px">
<div class="with-title qvobject" id="QV1-G02A">
</div>
</qliksense-card>
</div>
<div class="col-xs-12 col-sm-6">
<qliksense-card content-height="400px">
<div class="with-title qvobject" id="QV1-G02B">
</div>
</qliksense-card>
</div>
</div>
<!-- =================== -->
<!-- GRAFICOS 0.3 TABELA_UF PERCENTUAL_REPASSE -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-5">
<!-- Placing a .qvplaceholder within a <qliksense-card> will create a cardified object -->
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G03A">
</div>
</qliksense-card>
</div>
<div class="col-xs-12 col-sm-7">
<!-- Placing a .qvplaceholder within a <qliksense-card> will create a cardified object -->
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G03B">
</div>
</qliksense-card>
</div>
</div>
<!-- ================================================================================================================== -->
<!--<div class="row">
<b>DOSES REPASSADAS PELOS ESTADOS AOS MUNICÍPIOS</b>
</div>-->
<!-- =================== -->
<!-- KPIS -->
<!-- =================== -->
<!-- ====================================================== -->
<!-- Ver os icones em https://fontawesome.com/v4.7.0/icons/ -->
<!-- ====================================================== -->
<div class="row kpi-row">
<!--<div class="col-xs-12 col-sm-12 col-lg-3">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi01 qvobject" id="KPI-01B"></div>
</div>
<div class="col-xs-12 col-sm-6 col-lg-3">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi02 qvobject" id="KPI-02B"></div>
</div>
<div class="col-xs-12 col-sm-6 col-lg-3">
<div class="kpi-side"><i class="fas fa-syringe"></i></div>
<div class="kpi corkpi03 qvobject" id="KPI-03B"></div>
</div>-->
<div class="col-xs-12 col-sm-6 col-lg-6">
<div class="kpi-side">
<i class="fas fa-syringe">
</i>
</div>
<div class="kpi corkpi04 qvobject" id="KPI-04B">
</div>
</div>
</div>
<!--<div class="row">
<b>Esclarecimento: Doses em Trânsito são aquelas que estão sendo enviadas pelos Estados aos seus Municípios.</b>
</div>-->
<!-- =================== -->
<!-- MAPAS 0.4 MN RELOGIO -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-8">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G04A">
</div>
</qliksense-card>
</div>
<div class="col-xs-12 col-sm-4">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G04B">
</div>
</qliksense-card>
</div>
</div>
<!-- =================== -->
<!-- GRAFICOS 0.5 VACINA TEMPO -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-6">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G05A">
</div>
</qliksense-card>
</div>
<div class="col-xs-12 col-sm-6">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G05B">
</div>
</qliksense-card>
</div>
</div>
<!-- =================== -->
<!-- GRAFICOS 0.5 TABELA -->
<!-- =================== -->
<div class="row">
<div class="col-xs-12 col-sm-6">
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G06A">
</div>
</qliksense-card>
</div>
</div>
<!-- ====================================================== -->
<!-- EXPORT -->
<!-- ====================================================== -->
<div class="row kpi-row">
<div class="col-xs-12 col-sm-12 col-md-4">
<div class="kpi white-2 qvobject" id="TXT-Origem" style="box-shadow:none">
</div>
</div>
<div class="col-xs-12 col-sm-6 col-md-4">
<div class="kpi white-2 qvobject" id="TXT-DTATU" style="box-shadow:none">
</div>
</div>
<div class="col-xs-12 col-sm-6 col-md-4">
<div class="kpi white-2 qvplaceholder" id="BT-EXPO" style="box-shadow:none">
</div>
</div>
</div>
</div>
</div>
</iron-pages>
<!-- PAGINAS ## FIM ============================================================ -->
</paper-drawer-panel>
<!-- PAGINA UTIL FIM ============================================================ -->
</paper-header-panel>
<!-- PAGINA FIM ============================================================ -->
</paper-drawer-panel>
<!-- MODAL HELP INI ============================================================ -->
<!-- Modal -->
<div aria-hidden="true" class="modal fade" id="basic" role="basic" tabindex="-1">
<div class="modal-dialog">
<div class="modal-content">
<div class="modal-header">
<button aria-hidden="true" class="close" data-dismiss="modal" type="button">
</button>
<h4 class="modal-title">
SOBRE ESTE PAINEL
</h4>
</div>
<div class="modal-body">
<p>
Este painel apresenta informações sobre a distribuição de Vacinas contra a Covid-19, a partir do Ministério da Saúde.
<br/>
<br/>
A fonte dos dados é a Secretaria de Vigilância Sanitária (SVS).
<br/>
<br/>
Informações adicionais podem ser encontradas no site do
<a href="https://saude.gov.br/">
Ministério da Saúde
</a>
.
<br/>
___________________________
<br/>
<br/>
<img src="UsoPainel.png" style="width:565px;"/>
</p>
</div>
<div class="modal-footer">
<button class="btn dark btn-outline" data-dismiss="modal" type="button">
Close
</button>
</div>
</div>
<!-- /.modal-content -->
</div>
<!-- /.modal-dialog -->
</div>
<!-- End Modal -->
<!-- MODAL HELP FIM ============================================================ -->
<div class="footer" style="z-index: 20000; height:34px; background-color:#ccc;">
<div style="position:absolute; height:25px; top:10px; left:10px; text-align:left; color:#333;">
Versão Beta - Maiores informações no site do
<a href="https://saude.gov.br/">
Ministério da Saúde
</a>
</div>
<img src="LOGO_BASE.png" style="position:absolute; height:30px; width:145px; bottom:2px; right:10px;"/>
</div>
<script>
var root = this.root;
$(document).ready(function() {
$("#nav-drawer paper-menu paper-item").click(function() {
var index = $(this).index();
Polymer.dom(root).querySelector("iron-pages").selectIndex(index);
});
$("#nav-menu-button").click(function() {
Polymer.dom(root).querySelector("#nav-drawer").togglePanel();
});
$(window).resize(function() {
Polymer.updateStyles();
});
});
</script>
</body>
</html>
If I search for that div element in the response, I don't get the desired data.
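To make the problem concrete, here is a minimal sketch of what I mean. It uses only the standard library's html.parser instead of BeautifulSoup, so it runs without extra packages. The fragment is copied from the output above; the names TextById and text_of are made up for this illustration.

```python
from html.parser import HTMLParser

# Fragment copied from the response shown above.
FRAGMENT = """
<qliksense-card content-height="300px">
<div class="with-title qvobject" id="QV1-G01A">
</div>
</qliksense-card>
"""

# Tags that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextById(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # > 0 while we are inside the target element
        self.found = False    # True once the target element was seen
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_of(html, target_id):
    """Return (element was found, stripped text inside it)."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return parser.found, "".join(parser.chunks).strip()


found, text = text_of(FRAGMENT, "QV1-G01A")
print(found, repr(text))  # True ''
```

So the element is there, but it contains no text at all in the HTML that requests receives.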
What I need is a dictionary like this:
{"MG": 655588, "RJ": 758120, ...}
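In other words, the keys are state abbreviations and the values are plain integers. As a rough sketch of the shape I am after, assuming I somehow end up with (state, number-as-text) pairs: the function name to_doses_by_state and the "655.588" text format are assumptions of mine, not something the page is known to return.

```python
from typing import Dict, Iterable, Tuple


def to_doses_by_state(rows: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Build {state: doses} from (state label, number-as-text) pairs.

    Assumes the counts are non-negative whole numbers, possibly written
    with thousands separators such as "655.588" or "655,588".
    """
    result = {}
    for state, raw in rows:
        # Keep only the digits, dropping any thousands separators.
        digits = "".join(ch for ch in raw if ch.isdigit())
        result[state.strip().upper()] = int(digits)
    return result


# Hypothetical input; the two values are the ones from my example above.
rows = [("MG", "655.588"), ("RJ", "758.120")]
print(to_doses_by_state(rows))  # {'MG': 655588, 'RJ': 758120}
```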
The numbers in my example may change whenever the dashboard is updated.
How can I extract the data from those charts, given that it isn't inside any HTML tag?