My code:
from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright

INPUT_ERROR = (
    "Input is incorrect! Please type your username (email address) followed by "
    "a comma (,) and password. Example: yourname@email.com,MyPa55word"
)


def check_xyz_status(user_input):
    # Split on the first comma only, so the password may itself contain commas.
    user_input_list = user_input.split(',', maxsplit=1)
    if len(user_input_list) != 2 or "" in user_input_list:
        return INPUT_ERROR
    with sync_playwright() as p:
        # The Python API spells this keyword executable_path
        # (executablePath is the JavaScript spelling).
        browser = p.chromium.launch(headless=True, channel="chrome", executable_path='/app/vendor/chrome/bin/chrome')
        print(browser)  # debug output
        page = browser.new_page()
        page.goto('SITE_URL', timeout=300000)
        page.fill('input#UserName', user_input_list[0].strip())
        page.fill('input#Password', user_input_list[1].strip())
        page.click('input[type=submit]')
        try:
            page.goto('SITE_URL/PAGE', timeout=300000)
            # Wait for the results grid; this raises if it never appears.
            page.wait_for_selector('div.k-grid-content.k-auto-scrollable')
            html = page.inner_html('#elmnt')
        except Exception:
            return 'Incorrect username(email) or password.'
    soup = BeautifulSoup(html, 'html.parser')
    status_list = [td.text for td in soup.find_all('td')]
    # Cells 3 to 6 are read below, so at least 7 cells are required.
    if len(status_list) < 7:
        return 'Server is busy, try again later :)'
    return f"Your application's status is {status_list[3]}({status_list[5]} for {status_list[6]}), submitted on {status_list[4]}"
This works as expected on my local machine. When I deploy it to Heroku, the browser part does not return anything. The purpose is to log in to a website and fetch some data. The user input validation in the code still works on Heroku, because it has nothing to do with the browser object.
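To illustrate why the validation keeps working, here is a minimal sketch of that logic pulled out into its own function, so it can be exercised without Playwright or a browser. The name parse_credentials is hypothetical, and unlike the code above it strips whitespace before checking for empty fields.

```python
def parse_credentials(user_input):
    """Return (username, password), or None if the input is malformed."""
    # Split on the first comma only, so the password may contain commas.
    parts = user_input.split(",", maxsplit=1)
    if len(parts) != 2:
        return None
    username, password = parts[0].strip(), parts[1].strip()
    # Assumption: a field that is only whitespace counts as missing.
    if not username or not password:
        return None
    return username, password
```

Calling parse_credentials("yourname@email.com,MyPa55word") gives a (username, password) tuple on Heroku as well as locally, which suggests the failure is confined to the browser launch.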
Here is the issue:
browser = p.chromium.launch(headless=True)
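Since "does not return anything" hides the actual error, one way to see what the launch raises might be a small wrapper like the sketch below. The helper name run_and_report is hypothetical; it only uses the standard traceback module and makes no assumption about Playwright itself.

```python
import traceback


def run_and_report(action):
    """Call action() and return (result, None), or (None, traceback_text) on failure."""
    try:
        return action(), None
    except Exception:
        # format_exc() returns the full traceback of the exception being handled.
        return None, traceback.format_exc()
```

For example, result, error = run_and_report(lambda: check_xyz_status("yourname@email.com,MyPa55word")) followed by print(error) should make the traceback appear in the output of heroku logs --tail, assuming the app's stdout is captured there as usual.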
After some searching, I found that Heroku does not ship with a browser, so there is nothing for Playwright to launch. Based on suggestions I found, I added the following buildpacks:
https://github.com/mxschmitt/heroku-playwright-buildpack.git
https://github.com/heroku/heroku-buildpack-chromedriver
https://github.com/heroku/heroku-buildpack-google-chrome
https://github.com/minted/heroku-buildpack-chrome-headless
But no luck. I also changed the launch call as follows, and it still does not work:
with sync_playwright() as p:
    # The Python API spells this keyword executable_path, not executablePath.
    browser = p.chromium.launch(headless=True, channel="chrome", executable_path='/app/vendor/chrome/bin/chrome')
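Because the path above is hard-coded, it may be worth checking whether a Chrome binary actually exists there on the dyno. The following is only a diagnostic sketch: find_chrome is a hypothetical helper, and the environment variable names GOOGLE_CHROME_BIN and GOOGLE_CHROME_SHIM are an assumption about what the Google Chrome buildpack exports, so they should be verified against the buildpack's README.

```python
import os
import shutil


def find_chrome(candidate_paths,
                env_vars=("GOOGLE_CHROME_BIN", "GOOGLE_CHROME_SHIM"),
                names=("google-chrome", "chrome", "chromium")):
    """Return the first usable Chrome executable found, or None."""

    def is_executable(path):
        return bool(path) and os.path.isfile(path) and os.access(path, os.X_OK)

    # 1. Environment variables (names are an assumption, see above).
    for var in env_vars:
        path = os.environ.get(var)
        if is_executable(path):
            return path
    # 2. Explicit paths such as the one hard-coded in the launch call.
    for path in candidate_paths:
        if is_executable(path):
            return path
    # 3. Anything with a matching name on PATH.
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None
```

Running print(find_chrome(['/app/vendor/chrome/bin/chrome'])) on the dyno would show either a usable path or None; a None result would suggest that none of the buildpacks installed a binary where the code expects it.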
I am new to web scraping. Any help would be highly appreciated.