I'm trying to scrape a webpage that contains a table of test results using Python and BeautifulSoup. At this point I don't mind if it's just raw, unparsed HTML.
The table of results is contained within a parent div tag whose class is 'test-view-grid-area'.
I got the class name of the div tag by inspecting the webpage in Chrome, and viewing the page source confirms it is correct. But when I run the code below, my results come back as:
[<div class="test-view-grid-area"></div>]
So it appears to be finding the tag but not returning its contents? I am not sure what I need to do to get the contents of the div returned.
from bs4 import BeautifulSoup
import urllib3
http = urllib3.PoolManager()
url = '[url of server / webpage]'
headers = {}  # placeholder: the actual request headers are not shown in this post
response = http.request('GET', url, headers=headers)
soup = BeautifulSoup(response.data, 'html.parser')
grid_data = soup.find_all("div", class_="test-view-grid-area")
print(grid_data)
Edit: I've gotten a little further. I am now getting the following back from a script tag that contains a JSON string:
[<script class="__allSuitesOfSelectedPlan" defer="defer" type="application/json">
{"selectedOutcome":"","selectedTester":{"displayName" <etc>}</script>]
So now I am trying to figure out how to write a regex pattern that matches everything between the {}, run that pattern against my initial data scrape, and then load the JSON string into an object.
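To make that last step concrete, here is a rough sketch of what I have in mind, using only `re` and `json`. It assumes the script tag's class stays `__allSuitesOfSelectedPlan` and that the tag body is a single JSON object. The `sample_html` string and the `extract_suites` helper are made up for illustration, and the `displayName` value is invented because the real payload is cut off above.

```python
import json
import re

# Stand-in for the scraped HTML. The real JSON payload is elided in the post,
# so the "displayName" value here is invented for illustration.
sample_html = '''
<div class="test-view-grid-area"></div>
<script class="__allSuitesOfSelectedPlan" defer="defer" type="application/json">
{"selectedOutcome":"","selectedTester":{"displayName":"Example Tester"}}</script>
'''

# Capture from the first "{" inside the script tag up to the "}" that sits
# immediately before the closing </script>. DOTALL lets "." span newlines.
SCRIPT_JSON = re.compile(
    r'<script[^>]*class="__allSuitesOfSelectedPlan"[^>]*>\s*(\{.*?\})\s*</script>',
    re.DOTALL,
)

def extract_suites(html):
    """Return the parsed JSON object from the script tag, or None if absent."""
    match = SCRIPT_JSON.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))

suites = extract_suites(sample_html)
print(suites["selectedTester"]["displayName"])
```

It may turn out that no regex is needed at all: since BeautifulSoup already returns the script tag, passing that tag's `.string` to `json.loads` should give the same object, though I have not confirmed that against the real page.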