I am trying to use Python's BeautifulSoup library to extract HTML from my LinkedIn "Recently Added Connections" Page. Specifically, I want the name of the most recent connection - it appears towards the top of the page.
When I inspect the HTML for this specific section, what I see wrapping the content is:
<span class="mn-connection-card__name t-16 t-black t-bold">
Bob McBobface
</span>
However, the HTML I get back with BeautifulSoup does not contain that span. Apart from some boilerplate markup, all it contains is:
{"request":"/voyager/api/configuration","status":200,"body":"bpr-guid-3322365"}
{"status":401}
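The two fragments above can be checked programmatically rather than by eye. This is a minimal sketch, assuming each JSON object sits on its own line in the returned text, as it does above. The helper name find_status_codes is hypothetical and not part of any library.

```python
import json


def find_status_codes(text):
    """Return the "status" value of every line that parses as a JSON object."""
    codes = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # looked like JSON but is not; skip it
        if isinstance(payload, dict) and "status" in payload:
            codes.append(payload["status"])
    return codes


# The two fragments quoted above, used as sample input
sample = '''{"request":"/voyager/api/configuration","status":200,"body":"bpr-guid-3322365"}
{"status":401}'''

codes = find_status_codes(sample)
print(codes)
print("authentication rejected" if 401 in codes else "no 401 found")
```

A 401 among the collected codes would suggest the request was rejected as unauthenticated, rather than that the parsing went wrong.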
I've tried fiddling with the Requests library, but to no avail. I'm a beginner, so I'm hoping I don't need to spend a few weeks learning about OAuth or Selenium.
Here's my code:
from bs4 import BeautifulSoup
import urllib.request
# URL of the "Recently Added Connections" page
url = "https://www.linkedin.com/mynetwork/invite-connect/connections/"
# Fetch the page with a plain GET request
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
# print(soup)
# Look for the span that wraps the connection's name
content_list = soup.find_all('span', class_="mn-connection-card__name t-16 t-black t-bold")
print(content_list)
Running this prints an empty list, [], whereas I would expect a result containing "Bob McBobface".
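To rule out the selector as the cause, here is a minimal sketch that parses the inspected snippet from a hard-coded string, with no network request involved. Matching on the single class mn-connection-card__name, rather than the full class string, is an assumption about which class is the meaningful one.

```python
from bs4 import BeautifulSoup

# The markup copied from the browser's inspector, hard-coded for the test
snippet = """
<span class="mn-connection-card__name t-16 t-black t-bold">
Bob McBobface
</span>
"""

soup = BeautifulSoup(snippet, "html.parser")

# class_ with a single class name matches any tag that carries that class
spans = soup.find_all("span", class_="mn-connection-card__name")

# get_text(strip=True) drops the surrounding newlines
names = [span.get_text(strip=True) for span in spans]
print(names)
```

If this finds the name, the selector is probably fine, and the likely problem is that the fetched page never contains the span in the first place.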
When I print(soup), it just returns a short HTML blurb with the 401 error notice you see above.
Any advice?