I'm scraping a website to get each person's name, birth and death dates, and the name of the cemetery they are buried in. For the most part it works quite well; however, when I exported the text to a CSV, I noticed that a blank cell is inserted in the name column after each page. I have a feeling this is related to the loop rather than an HTML tag, but I'm still learning. Any advice is welcome! Thanks, everyone.
Here's an example of the problem in Excel:
import csv

import requests
from bs4 import BeautifulSoup

# Placeholder: the original post did not show how `headers` was defined
headers = {'User-Agent': 'Mozilla/5.0'}

# Build the search URL from its query-string pieces
api = 'https://www.findagrave.com/memorial/search?'
name = 'firstname=&middlename=&lastname='
years = 'birthyear=&birthyearfilter=&deathyear=&deathyearfilter='
place = 'location=Yulee%2C+Nassau+County%2C+Florida%2C+United+States+of+America&locationId=city_28711'
memorialid = 'memorialid=&mcid='
linkname = 'linkedToName='
daterange = 'datefilter='
plotnum = 'orderby=r&plot='
page = 'page='
url = api + name + "&" + years + "&" + place + "&" + memorialid + "&" + linkname + "&" + daterange + "&" + plotnum + '&' + page

for page_no in range(1, 93):
    url_final = url + str(page_no)
    page = requests.get(url_final, headers=headers)
    #print(page)
    soup = BeautifulSoup(page.content, "html.parser")
    graves = soup.find_all('div', {'class': 'memorial-item py-1'})
    #print(graves)
    # Getting the Names
    grave_name = soup.find_all('h2', {'class': 'name-grave'})
    # Dates
    dates = soup.find_all('b', {'class': 'birthDeathDates'})
    # Graveyard Name
    grave_yard = soup.find_all('button', {'role': 'link'})
    #print(grave_yard)
    # One (name, dates, graveyard) row per result on this page
    dataset = [(x.text, y.text, z.text) for x, y, z in zip(grave_name, dates, grave_yard)]
    # Append this page's rows to the CSV
    with open('Fernandiabeach3.csv', 'a') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(dataset)
I've tried to see if there were any similar tags happening at the beginning of each new page, but I couldn't find anything that stood out.
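A minimal sketch of one way to narrow this down, assuming the blank comes either from a matched element whose text is empty or from the three result lists having different lengths on a page. `summarize_page` is a hypothetical helper that is not part of my script, and the sample values are made up:

```python
def summarize_page(page_no, names, dates, yards):
    """Summarize one page of scraped text so odd pages stand out."""
    return {
        "page": page_no,
        # zip() stops at the shortest list, so unequal counts mean
        # rows are being dropped or paired with the wrong values
        "counts": (len(names), len(dates), len(yards)),
        # positions where the name text is empty or whitespace only
        "blank_names": [i for i, n in enumerate(names) if not n.strip()],
    }


# Made-up sample data standing in for one scraped page
sample = summarize_page(
    1,
    ["Jane Doe", "  ", "John Roe"],
    ["1900 - 1980", "1910 - 1990"],
    ["Some Cemetery", "Other Cemetery"],
)
print(sample)
```

Inside the loop this could be called with `[x.text for x in grave_name]` and the equivalent lists for `dates` and `grave_yard`, printing the summary for each `page_no` to see whether every page shows the same pattern.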