I am building a little tool to scrape a TTRPG website for data and write that data into a Google Sheet. This is my code so far:
import requests
from bs4 import BeautifulSoup
import gspread
gc = gspread.service_account(filename='credentials.json')
sh = gc.open('D&D_Tables').sheet1
url = 'https://www.d20srd.org/srd/monsters/achaierai.htm'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
# line below uses Beautiful Soup to locate table entries within HTML, returns all results as text
monster_stats_table = soup.find('table', class_='statBlock').text
# line below wraps the text in a list (not a dictionary); otherwise the program returns an error
new_mst = [monster_stats_table]
sh.append_row(new_mst) # currently appends all information to one cell, needs to be broken up
The information shows up in one cell, stretched out over dozens of lines, with a lot of extra whitespace. I have tried several methods to remove the whitespace and format the data correctly, but none of them has worked.
[image: Showing Problem]
I am trying to have the table look something like this instead:
[image: Correct Table]
Thank you for any help or suggestions you can provide. :)
I have attempted to use the .strip() method, and I have also tried importing the json library and (separately) the ast library to use functions that were suggested for removing the whitespace. Neither produced usable output because of how the raw data is formatted. I am thinking I need to write the data to a JSON object and then import that into the sheet, but I am not certain that is the best way, or how to do it.
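For reference, one possible direction that avoids JSON altogether is to read the table row by row and cell by cell, collapse the whitespace inside each cell, and send the result to the sheet as a list of rows. The sketch below is only an illustration under assumptions: it uses the standard library's html.parser instead of Beautiful Soup so that it runs without extra installs, the names `TableRows`, `table_to_rows` and `sample_html` are hypothetical, and the sample markup is made up rather than copied from the real page. It also assumes only the stat block's HTML is passed in (for example `str(soup.find('table', class_='statBlock'))`).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. indentation between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None and self._row is not None:
            # split() + join collapses newlines and runs of spaces to one space.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_rows(html):
    """Return the table in `html` as a list of rows (lists of clean strings)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample that imitates the shape of a stat block table.
sample_html = """
<table class="statBlock">
  <tr><th>Size/Type:</th><td>
      Large   Outsider
      (Evil, Extraplanar)
  </td></tr>
  <tr><th>Hit Dice:</th><td>6d8+12 (39 hp)</td></tr>
</table>
"""

rows = table_to_rows(sample_html)
print(rows)

# Assumption: `sh` is the gspread worksheet from the code above.
# append_rows writes one sheet row per inner list, one cell per string:
# sh.append_rows(rows)
```

With Beautiful Soup the same idea would presumably be a loop over `find_all('tr')` that calls `get_text(strip=True)` on each `th`/`td` cell. Either way, the key point is that the sheet receives a list of lists rather than one long string, so each value lands in its own cell.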