I realize this is an odd use case. The non-profit I work for has a google sheet that has webinar info in it. I am trying to pull this info out and then format it in HTML so they can just copy-paste paste onto the website.
Reach row is one webinar.
All the speakers are in one cell for each webinar.
I am trying to pull the names and country out and put them in a list. I am super new to python so if there I a better way I am super open to it.
Sample data from the cell:
Moderators:
Willaim Riker MD (USA) <wriker@example.com>
Deana Troy (Portugal) <dtroy@gmail.com>
Speakers:
Tasha Yar, MD PhD (Brazil) <example@example.com>
S'chn T'gai Spock, MD PhD FACS (USA) <example@gexample.com>
Leonard James Akaar, MD (Argentina) <example@gexample.com>
Worf Wo'rIv PhD (Brazil) <example@example.com>
What I have so far-using gspread
:
import re
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import json
scopes = [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive'
]
credentials = ServiceAccountCredentials.from_json_keyfile_name("webinar-gs-html-16720182572b.json", scopes) #access the json key you downloaded earlier
file = gspread.authorize(credentials) # authenticate the JSON key with gspread
sheet = file.open("PAAO_Webinar_Calendar")
worksheet = sheet.get_worksheet(0)
row = input("Enter Row Number:")
values_list = worksheet.row_values(row)
#make a list of the speakers
alspkrs = worksheet.cell(row, 7).value
spkrs = re.findall('([a-zA-Z]+\s[a-zA-Z]+\s+\([a-zA-Z]+\))', alspkrs)
spkrnum = len(spkrs)
print(spkrs)
print(spkrs)
results in
['Riker MD (USA)', 'Deana Troy (Portugal)', 'MD PhD (Brazil)', 'PhD FACS (USA)',
'rIv PhD (Brazil)']
I originally tried something like for line in alspkrs:
etc but couldn't figure out how to get the names when sometimes they are 2 words, sometimes 3 or four.
Ideally, they would end up in a dictionary with keys like moderator1
, speaker1
, etc but I am not there yet.