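I wanted a solution that sticks with the find method, since it is the most convenient and adaptable.
The problem is that the HTML comments break the text match: with the comments inside the tag, the <a> element has several children, so its .string is None and a search for 'Australia' never matches. Removing the comments first fixes this.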
from bs4 import BeautifulSoup, Comment
bs = BeautifulSoup(
"""
<a class="accordion-item__link" href="/identity-checking/individual"><!-- react-text: 178 -->Australia<!-- /react-text --></a>
""",
"lxml"
)
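# find all HTML comments and remove them from the tree
comments = bs.find_all(string=lambda text: isinstance(text, Comment))
for comment in comments:
    comment.extract()
# with the comments gone, the <a> tag has a single string child, so the match works
r = bs.find('a', string='Australia')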
print(r)
# <a class="accordion-item__link" href="/identity-checking/individual">Australia</a>
The comment-removal approach comes from the question "How can I strip comment tags from HTML using BeautifulSoup?"
If the comments need to be preserved, work on a copy of the soup instead.
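A minimal sketch of the copy approach, assuming copy.copy() is acceptable for duplicating the tree. The names html, stripped, match and remaining are only illustrative, and "html.parser" is used here instead of "lxml" purely to avoid the extra dependency:

```python
import copy

from bs4 import BeautifulSoup, Comment

# Same markup as in the answer above (the variable name is illustrative).
html = (
    '<a class="accordion-item__link" href="/identity-checking/individual">'
    '<!-- react-text: 178 -->Australia<!-- /react-text --></a>'
)
soup = BeautifulSoup(html, "html.parser")

# Strip comments from a copy so the original tree keeps them.
stripped = copy.copy(soup)
for comment in stripped.find_all(string=lambda s: isinstance(s, Comment)):
    comment.extract()

# Search the comment-free copy.
match = stripped.find("a", string="Australia")

# The original soup still contains both comments.
remaining = soup.find_all(string=lambda s: isinstance(s, Comment))

print(match)
print(len(remaining))
```

Note that the tag returned by find belongs to the copy, not to the original soup, so any later modification of it will not show up in the original tree.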