I have a script that tries to get a Twitter handle's name. Here's the script:
import re
import requests
from bs4 import BeautifulSoup

url = 'https://web.archive.org/web/20150623154546/https:/twitter.com/biz/status/600839913286672384'
r = requests.get(url).text
# Regex matched against the class attribute of the <strong> tag
c = re.compile('.*fullname js-action-profile-name show-popup-with-id.*')
user = BeautifulSoup(r, "lxml").find("strong", {"class": c}).getText()
print(user)
Now, this is what the Element Inspector looks like:
I want to print only the text highlighted above in yellow (Biz Stone). However, this is my Python output:
Biz StoneVerified account
As you can see, the string "Verified account" is appended, for an unfathomable reason. The text in Inspect Element only shows "Biz Stone". Why does this occur, and how can I fix it?
I tried fixing it by following the instructions in this solution and this one. Unfortunately, the first one (adding text=True) printed random people's names, while adding "recursive=False" to it, as in the second solution, printed nothing.
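
To make the symptom easier to reproduce without the network request, here is a minimal sketch. The HTML string is hypothetical: it assumes the `<strong>` element contains a nested element holding the "Verified account" text, which may not match the archived page exactly. Under that assumption, `get_text()` joins the text of every descendant, while `find_all(string=True, recursive=False)` called on the matched tag keeps only its direct text nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration; the real archived page may differ.
html = (
    '<strong class="fullname js-action-profile-name show-popup-with-id">'
    'Biz Stone'
    '<span class="Icon Icon--verified">'
    '<span class="u-hiddenVisually">Verified account</span>'
    '</span>'
    '</strong>'
)

# html.parser ships with Python, so lxml is not required for this sketch.
tag = BeautifulSoup(html, "html.parser").find("strong", class_="fullname")

# get_text() concatenates the text of all descendants, including the nested span.
all_text = tag.get_text()

# string=True with recursive=False returns only text nodes that are
# direct children of the <strong> tag.
own_text = "".join(tag.find_all(string=True, recursive=False)).strip()

print(all_text)  # Biz StoneVerified account
print(own_text)  # Biz Stone
```

Note that `recursive=False` is applied here to the matched tag, not to the initial document-wide search. Passed to the initial `find` on the whole document, it would restrict the search to the document's top-level children, which could explain an empty result.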