This is a slightly different approach from the BeautifulSoup version below, to give you options.
I like parsing with BeautifulSoup until I see <table> tags. At that point I usually switch to pandas, because it can pull the table out in one line and I can then manipulate the DataFrame as needed.
The DataFrame can then be converted to JSON (I actually learned this from an ewwink solution a few weeks back :-) ).
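import json
from io import StringIO

import pandas as pd
import requests

url = 'https://bgp.he.net/country/US'
session = requests.Session()
# Browser-like headers (the site may reject the default requests User-Agent)
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    "Accept-Encoding": "gzip, deflate",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Language": "en"}
response = session.get(url, headers=headers)
response.raise_for_status()  # stop early on an HTTP error
# read_html returns a list of DataFrames, one per <table> on the page;
# StringIO avoids the deprecation warning newer pandas emits for raw HTML strings
tables = pd.read_html(StringIO(response.text))
table = tables[0]
# tag every row with the country code taken from the end of the URL
table['Country'] = url.split('/')[-1]
# list of dicts, one per table row
jsonObject = table.to_dict(orient='records')
# if you need it as a string to write to a JSON file
jsonObject_string = json.dumps(jsonObject)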
Output:
[{'ASN': 'AS6939', 'Name': 'Hurricane Electric LLC', 'Adjacencies v4': 7216, 'Routes v4': 127337, 'Adjacencies v6': 4460, 'Routes v6': 28227, 'Country': 'US'}, {'ASN': 'AS174', 'Name': 'Cogent Communications', 'Adjacencies v4': 5692, 'Routes v4': 118159, 'Adjacencies v6': 1914, 'Routes v6': 8814, 'Country': 'US'}...
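If the end goal is a JSON file on disk, the list of records can also be written straight out with json.dump, skipping the intermediate string. This is only a minimal sketch: the two records are copied from the output above, while the helper name save_records and the file name asn_us.json are made up for illustration. In the real script you would pass jsonObject instead of the hard-coded list.

```python
import json
import os
import tempfile

# First two records copied from the output above; in the real script this
# list would be `jsonObject` (the result of table.to_dict(orient='records')).
records = [
    {'ASN': 'AS6939', 'Name': 'Hurricane Electric LLC', 'Adjacencies v4': 7216,
     'Routes v4': 127337, 'Adjacencies v6': 4460, 'Routes v6': 28227, 'Country': 'US'},
    {'ASN': 'AS174', 'Name': 'Cogent Communications', 'Adjacencies v4': 5692,
     'Routes v4': 118159, 'Adjacencies v6': 1914, 'Routes v6': 8814, 'Country': 'US'},
]


def save_records(records, path):
    """Write a list of dicts to `path` as a JSON array (hypothetical helper)."""
    with open(path, 'w', encoding='utf-8') as fh:
        json.dump(records, fh, indent=2)


# Hypothetical output location; a temp directory keeps the sketch side-effect free.
out_path = os.path.join(tempfile.mkdtemp(), 'asn_us.json')
save_records(records, out_path)

# Read the file back to confirm it round-trips to the same structure.
with open(out_path, encoding='utf-8') as fh:
    loaded = json.load(fh)
print(len(loaded), loaded[0]['ASN'])
```

Reading the file back with json.load is just a sanity check that what was written is valid JSON; you can drop that part once you trust the output.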