Scraping data from a Sankey diagram using Python and BeautifulSoup

Question

I am new to Python and am currently trying to figure out how to scrape data from this website:

https://www.iea.org/sankey/#?c=Indonesia&s=Balance

I have tried using BeautifulSoup and Selenium, but it didn't work. I need the data that is shown inside the diagram. Thank you for your answer.

I tried using Python and BeautifulSoup and expected a table to come out, but it didn't:

import requests
from bs4 import BeautifulSoup

url = "https://www.iea.org/sankey/#?c=Indonesia&s=Balance"
response = requests.get(url)
html_content = response.content

soup = BeautifulSoup(html_content, 'html.parser')
data = soup.find_all('div', {'class': 'sankey-data'})[0].text

print(data)

score 0 · Accepted Answer · answered Mar 01 '23 at 07:39


There is no table on the page; the data is loaded separately through additional requests (https://www.iea.org/sankey/data/Indonesia.SBBSBBBSBBS_YY.txt).
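
This is also why the requests attempt in the question cannot see the country selection: everything after the # in the URL is a fragment, which stays in the browser for the page's JavaScript and is not sent to the server. A minimal sketch using only the standard library shows which URL is actually requested (the variable names are only for illustration):

```python
from urllib.parse import urlsplit, urlunsplit

url = "https://www.iea.org/sankey/#?c=Indonesia&s=Balance"
parts = urlsplit(url)

# Everything after '#' is parsed as the fragment, not as a query string.
fragment = parts.fragment
query = parts.query

# Dropping the fragment gives the URL that an HTTP client sends to the server.
requested = urlunsplit(parts._replace(fragment=""))

print(fragment)
print(query)
print(requested)
```

So the server only ever receives a request for the bare page, and the diagram data has to be fetched from its own URL.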

Because the OP provides little information, including about the expected output, here is a simple approach that should at least point in the right direction and can be adapted to your requirements.

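import pandas as pd

# Load the tab-separated file that the page requests in the background;
# the first seven rows are used as a multi-level column header.
df = pd.read_csv('https://www.iea.org/sankey/data/Indonesia.SBBSBBBSBBS_YY.txt', sep='\t', header=[0,1,2,3,4,5,6])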
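
If you would rather look at the raw response before handing it to pandas, something along these lines may help. It is only a sketch: it assumes the file is plain tab-separated text (which is what sep='\t' above implies), the function names fetch_text and parse_tsv are hypothetical, the browser-like User-Agent header is a guess that may not be needed, and the sample string is made up and is not real IEA data.

```python
import csv
import io
import urllib.request

DATA_URL = "https://www.iea.org/sankey/data/Indonesia.SBBSBBBSBBS_YY.txt"


def fetch_text(url: str = DATA_URL) -> str:
    """Download the raw text that the page loads in the background."""
    # Assumption: a browser-like User-Agent avoids being rejected by the server.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_tsv(text: str) -> list[list[str]]:
    """Split tab-separated text into rows of cells, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]


# Made-up sample for illustration only, NOT real IEA data.
sample = "flow\t2019\t2020\nA\t1\t2\n\nB\t3\t4\n"
rows = parse_tsv(sample)
print(rows[0])
print(len(rows))
```

To try it on the real file, call parse_tsv(fetch_text()) and print the first few rows to decide how many header rows to pass to pandas.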

answered Mar 01 '23 at 07:39

HedgeHog


Can you explain how you converted it to a txt file? – Bagas Mar 01 '23 at 08:55
I did not convert anything into a text file; the website itself fetches this file through a separate request. To find such requests, look at the XHR tab in your browser's dev tools. – HedgeHog Mar 01 '23 at 10:13
