To practice, I am trying to scrape the following website, which displays data across multiple pages. Unfortunately, I keep getting a 500 Internal Server Error for every page whenever I try to parse the data, in this case contained in the td tags.
Here's my attempt so far. The URLs are built correctly, but I constantly get a 500 error, so I can't structure the data and build a DataFrame containing the td tags from each page. Any ideas on how to solve this? Thanks!
import requests
import pandas as pd
from bs4 import BeautifulSoup
from time import sleep
from random import randint

url = "https://www.linguasport.com/futbol/nacional/liga/seekff_esp.asp?pn={}"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'}

dfs = []
for page in range(1, 20):
    sleep(randint(1, 5))  # random pause between requests
    response = requests.get(url.format(page), headers=headers)  # this comes back as a 500
    soup = BeautifulSoup(response.content, "html.parser")
    tds = soup.find_all("td")
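For reference, this is a minimal sketch of how I plan to turn each page's td tags into a DataFrame once the requests succeed. The sample HTML and the column names are assumptions, since I don't know the real table layout yet; the idea is just to group the tds row by row via their parent tr instead of flattening them all into one list:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for one page's response body
# (the real table layout is unknown to me).
html = """
<table>
  <tr><td>1929</td><td>FC Barcelona</td></tr>
  <tr><td>1930</td><td>Athletic Club</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# One list of cell texts per table row, not one flat list of tds.
rows = [[td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in soup.find_all("tr")]
df = pd.DataFrame(rows, columns=["season", "champion"])  # assumed column names
print(df)
```

The per-page DataFrames would then be appended to dfs and combined with pd.concat(dfs) after the loop.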