I'm trying to scrape the table from the following website using Selenium: https://web.archive.org/web/20120220031809/http://simcentral.net/ibaf/games/1

with the code:

from selenium import webdriver as wd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup as bs
from pandas.io.html import read_html
import pandas as pd
import numpy as np
import re
import os
os.chdir('c:/Users/Owner')
bat=pd.DataFrame()

driver = wd.Chrome()
wait = WebDriverWait(driver,15)
driver.get('https://web.archive.org/web/20120220031809/http://simcentral.net/ibaf/games/1')
page=driver.find_element(By.XPATH, '//*[contains(concat( " ", @class, " " ), concat( " ", "regtext", " " ))] | //*[contains(concat( " ", @class, " " ), concat( " ", "normal", " " ))]')
table_html=page.get_attribute('innerHTML')

driver.quit()

I get the following error:

StaleElementReferenceException: stale element reference: element is not attached to the page document

I looked online and understand the problem, but don't know what to do about it. The other reports I found seem to locate their elements some way other than XPath. I know it fails at the table_html= line because if I remove that line, everything above it works and the browser closes as expected.

Thanks for any help.

1 Answer

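Try this. pd.read_html fetches the page itself and returns a list of DataFrames, one per table it finds, so no browser is needed: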

import pandas as pd
data = pd.read_html("https://web.archive.org/web/20120220031809/http://simcentral.net/ibaf/games/1")
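Since read_html returns a list rather than a single DataFrame, you still have to pick out the table you want. Here is a minimal sketch of that step. SAMPLE_HTML and tables_from_html are made up for illustration, and the column names are placeholders, not the real page's headers. If you would rather stay with Selenium, passing driver.page_source to the same helper should also work, because it hands over a plain string instead of a WebElement that can go stale.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the archived page; the real column names are unknown here.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Player</th><th>AB</th><th>H</th></tr></thead>
  <tbody>
    <tr><td>Smith</td><td>4</td><td>2</td></tr>
    <tr><td>Jones</td><td>3</td><td>1</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>Team</th><th>Runs</th></tr></thead>
  <tbody>
    <tr><td>Home</td><td>5</td></tr>
  </tbody>
</table>
"""


def tables_from_html(html):
    """Return every <table> found in an HTML string as a list of DataFrames."""
    # Wrap the string in StringIO: recent pandas versions deprecate
    # passing literal HTML text directly to read_html.
    return pd.read_html(StringIO(html))


tables = tables_from_html(SAMPLE_HTML)

# Inspect what was found before deciding which table to keep.
for i, table in enumerate(tables):
    print(i, table.shape, list(table.columns))

# Assumption: the first table is the one of interest.
bat = tables[0]
```

On the real page, check the printed shapes and columns first; the index of the table you want may well not be 0.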
dimay