Extract text from anchor tag in BeautifulSoup

Question

I'm trying to extract the titles from a URL, but the element doesn't have a class. The following line is taken from the page source.

<a href="/f/oDhilr3O">Unatama Don</a>

The title actually does have a class, but as you can see below I start at index 3 because the first 3 titles aren't what I want. I don't want to hard-code that, though. On the website the title is also a link, hence the anchor tag above.

title_name = soup.find_all('div', class_='food-description-title')
title_list = []

for i in range(3, len(title_name)):
    title = title_name[i].text
    title_list.append(title)

"Unatama Don" is the title I'm trying to get.

Please make it [mcve] . Good idea to check [ask] links as well — Morse, Jul 20 '18 at 21:16
`<a>` is a hyperlink, not a `div` class. Your code is fetching `div` elements, not the elements you are expecting — Morse, Jul 20 '18 at 21:19
Possible duplicate of [Python: BeautifulSoup extract text from anchor tag](https://stackoverflow.com/questions/11716380/python-beautifulsoup-extract-text-from-anchor-tag) — Morse, Jul 20 '18 at 21:25
Use selenium? https://stackoverflow.com/questions/33155454/how-to-find-an-element-by-href-value-using-selenium-python driver.find_element_by_xpath('//a[@href="/f/oDhilr3O"]'); — QHarr, Jul 20 '18 at 23:21

score 0 · Answer 1 · answered Jul 25 '18 at 18:22

Here's an example of searching for an anchor element with a specific URL in BeautifulSoup:

from bs4 import BeautifulSoup

document = '''
  <a href="https://www.google.com">google</a>
  <a href="/f/oDhilr3O">Unatama Don</a>
  <a href="test">Don</a>
'''

soup = BeautifulSoup(document, "lxml")
url = "/f/oDhilr3O"

for x in soup.find_all("a", {"href" : url}):
    print(x.text)

Output:

Unatama Don
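
If the exact href isn't known in advance, the same idea can be loosened to match a pattern instead of one URL. The following is only a sketch: it assumes the wanted links all start with "/f/" (a guess based on the single href in the question), and the surrounding markup and the extra links are invented for illustration. It uses the built-in "html.parser" so that lxml isn't required.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the div class and the "Unatama Don" link come from
# the question; the other two links are invented for illustration.
document = '''
<div class="food-description-title"><a href="/about">About us</a></div>
<div class="food-description-title"><a href="/f/oDhilr3O">Unatama Don</a></div>
<div class="food-description-title"><a href="/f/abc123">Katsu Don</a></div>
'''

soup = BeautifulSoup(document, "html.parser")

# Keep only anchors whose href starts with "/f/" (assumed pattern),
# instead of skipping a hard-coded number of leading results.
anchors = soup.find_all("a", href=re.compile(r"^/f/"))
title_list = [a.get_text(strip=True) for a in anchors]

print(title_list)
```

Filtering on the href avoids relying on the position of the titles in the page, which is what the index 3 in the question was doing.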

score 0 · Answer 2 · answered Jul 25 '18 at 20:26

The requests and bs4 modules are very helpful for tasks like this. Have you tried something like the code below?

import requests
from bs4 import BeautifulSoup

url = 'PASTE/YOUR/URL/HERE'  # replace with the address of the page to scrape
response = requests.get(url)
page = response.text
soup = BeautifulSoup(page, 'html.parser')
links = soup.find_all('a', href=True)

for each in links:
    print(each.text)

I think this gives the outcome you are looking for. If you would like the hyperlinks as well, add another loop with "print(each.get('href'))" inside it. Let us know how it goes.
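
As a rough sketch of that last suggestion, the text and the hyperlink can also be collected together in a single pass rather than in two loops. The HTML string here is a stand-in for response.text so the example runs without a network request; the third anchor is invented to show that href=True skips anchors without an href.

```python
from bs4 import BeautifulSoup

# Stand-in for response.text; the last anchor is a made-up example
# of a tag that has no href attribute.
page = '''
<a href="https://www.google.com">google</a>
<a href="/f/oDhilr3O">Unatama Don</a>
<a name="top">no href here</a>
'''

soup = BeautifulSoup(page, "html.parser")

# href=True keeps only anchors that actually carry an href attribute.
pairs = [(a.get_text(strip=True), a["href"])
         for a in soup.find_all("a", href=True)]

for text, href in pairs:
    print(text, href)
```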
