The best way to get the data from a Google spreadsheet in Python is by using gspread, which is a Python API for Google Sheets.
However, there are alternatives if you aren't the owner of the spreadsheet (or you just want to try another method as an exercise). For instance, you can do it using the requests and bs4 modules, as you can see in this answer.
Applied to your specific case, the code would look like this ("Datos Argentina - Salas de Cine" spreadsheet):
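import typing

import requests
from bs4 import BeautifulSoup

def scrapeDataFromSpreadsheet() -> typing.List[typing.List[str]]:
    # Download the spreadsheet page as HTML.
    html = requests.get('https://docs.google.com/spreadsheets/d/1o8QeMOKWm4VeZ9VecgnL8BWaOlX5kdCDkXoAph37sQM/edit#gid=1691373423').text
    # The 'lxml' parser requires the lxml package to be installed.
    soup = BeautifulSoup(html, 'lxml')
    # Take the first <table> on the page, which is assumed to hold the sheet data.
    salas_cine = soup.find_all('table')[0]
    # Build one list of cell texts per table row.
    rows = [[td.text for td in row.find_all("td")] for row in salas_cine.find_all('tr')]
    return rows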
Important note: with the link provided (and the code above) you will only be able to get the first 100 rows of data!
This can be fixed in more than one way. What I tried was modifying the URL of the spreadsheet so that it displays the data as a simple HTML table (reference).
Old URL: https://docs.google.com/spreadsheets/d/1o8QeMOKWm4VeZ9VecgnL8BWaOlX5kdCDkXoAph37sQM/edit#gid=1691373423
New URL (remove edit#gid=1691373423 and add gviz/tq?tqx=out:html&tq&gid=1): https://docs.google.com/spreadsheets/d/1o8QeMOKWm4VeZ9VecgnL8BWaOlX5kdCDkXoAph37sQM/gviz/tq?tqx=out:html&tq&gid=1
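If you need to do this rewrite for more than one spreadsheet, it could be scripted. The following is only a minimal sketch: the helper name to_gviz_html_url and its gid parameter are my own (hypothetical) names, the regular expression assumes the spreadsheet ID is the path segment right after /d/, and the default gid of "1" simply mirrors the URL shown above.

```python
import re


def to_gviz_html_url(edit_url: str, gid: str = "1") -> str:
    """Rewrite a Google Sheets 'edit' URL into the gviz HTML-table form.

    Hypothetical helper: it only rearranges the string, it does not
    check that the spreadsheet exists or is publicly readable.
    """
    # Assumption: the ID follows '/d/' (optionally preceded by '/u/<n>/').
    match = re.search(r"/spreadsheets/(?:u/\d+/)?d/([A-Za-z0-9_-]+)", edit_url)
    if match is None:
        raise ValueError(f"not a Google Sheets URL: {edit_url!r}")
    spreadsheet_id = match.group(1)
    return (
        "https://docs.google.com/spreadsheets/d/"
        f"{spreadsheet_id}/gviz/tq?tqx=out:html&tq&gid={gid}"
    )


old_url = (
    "https://docs.google.com/spreadsheets/d/"
    "1o8QeMOKWm4VeZ9VecgnL8BWaOlX5kdCDkXoAph37sQM/edit#gid=1691373423"
)
new_url = to_gviz_html_url(old_url)
```

Here new_url should be the same string as the "New URL" written out by hand above.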
Now you are able to obtain all the rows that the spreadsheet contains:
# Same function as before; only the URL changes (imports as in the first snippet).
def scrapeDataFromSpreadsheet() -> typing.List[typing.List[str]]:
    html = requests.get('https://docs.google.com/spreadsheets/u/0/d/1o8QeMOKWm4VeZ9VecgnL8BWaOlX5kdCDkXoAph37sQM/gviz/tq?tqx=out:html&tq&gid=1').text
    soup = BeautifulSoup(html, 'lxml')
    salas_cine = soup.find_all('table')[0]
    rows = [[td.text for td in row.find_all("td")] for row in salas_cine.find_all('tr')]
    return rows
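The function returns plain lists of strings, so you may want to turn them into something easier to query. This is just a sketch under an assumption I have not verified against this particular sheet: that the first non-empty row holds the column names. The helper name rows_to_records and the sample values are made up for illustration.

```python
from typing import Dict, List


def rows_to_records(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Turn a list of table rows into a list of dicts keyed by the header row."""
    # Skip rows with no <td> cells at all (they come back as empty lists).
    non_empty = [row for row in rows if row]
    if not non_empty:
        return []
    # Assumption: the first non-empty row contains the column names.
    header, *data = non_empty
    # zip() stops at the shorter sequence, so extra cells are dropped
    # and missing cells simply leave their keys out of the dict.
    return [dict(zip(header, row)) for row in data]


# Invented sample data, shaped like the output of scrapeDataFromSpreadsheet().
sample_rows = [
    ["nombre", "provincia"],
    [],
    ["Cine A", "Buenos Aires"],
    ["Cine B", "Córdoba"],
]
records = rows_to_records(sample_rows)
```

With the real data you would call rows_to_records(scrapeDataFromSpreadsheet()) instead of passing sample_rows.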