The content is inside an iframe and is updated via JavaScript, so it is not present in the response to the initial request. You can request the same URL the page uses to load the iframe content (the iframe src). Then extract the string from the script tag that holds the info and load it with json, extract the description (which is HTML), and pass it back to bs to select the h2 tags. The rest of the info is then stored in the second soup object as well, if required.
import json

import requests
from bs4 import BeautifulSoup as bs

# Request the iframe src directly; the job details are not in the parent page's HTML.
r = requests.get('https://careersus-endologix.icims.com/jobs/2034/associate-supplier-quality-engineer/job?mobile=false&width=1140&height=500&bga=true&needsRedirect=false&jan1offset=0&jun1offset=60&in_iframe=1')
soup = bs(r.content, 'lxml')

# The job data is embedded as JSON-LD in a script tag.
script = soup.select_one('[type="application/ld+json"]').text
data = json.loads(script)

# The description is itself HTML, so parse it with a second soup object.
soup = bs(data['description'], 'lxml')
headers = [item.text for item in soup.select('h2')]
print(headers)
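
To try the two-stage parse without making a network request, here is a minimal sketch that runs the same steps on a made-up HTML snippet. The `sample_html` string and the `extract_headers` helper are hypothetical and only for illustration; the real iframe response will contain different fields and markup. It uses the built-in `html.parser` instead of `lxml` so no extra parser is needed.

```python
import json
from bs4 import BeautifulSoup as bs

# Hypothetical stand-in for the iframe response (assumption: the real page
# embeds a similar JSON-LD script tag with an HTML "description" field).
sample_html = """
<html><head>
<script type="application/ld+json">
{"title": "Example role", "description": "<h2>Overview</h2><p>Text</p><h2>Responsibilities</h2><ul><li>Item</li></ul>"}
</script>
</head><body></body></html>
"""

def extract_headers(html):
    """Return the h2 texts found inside the JSON-LD description, or [] if absent."""
    outer = bs(html, 'html.parser')
    script = outer.select_one('script[type="application/ld+json"]')
    if script is None or script.string is None:
        return []  # no JSON-LD block found
    data = json.loads(script.string)
    # The description is HTML, so it gets its own soup object.
    inner = bs(data.get('description', ''), 'html.parser')
    return [h.get_text(strip=True) for h in inner.select('h2')]

print(extract_headers(sample_html))
```

Returning an empty list when the script tag is missing avoids the `AttributeError` you would otherwise get from calling `.text` on `None` if the page layout changes.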
