I want to strip the extra characters from my output so that only the link remains. Sorry for my bad English.

from bs4 import BeautifulSoup
import requests

url = 'https://cryptofaucets.cash/2021/02/01/best-free-bitcoin-faucets/'
r = requests.get(url).text
soup = BeautifulSoup(r,'lxml')
# print(soup.prettify())
l = []
for link in soup.find_all('h2'):
    print(link.find_all('a')[-1])

I want output like this:

https://satoshihero.com/en/register?

not like this:

<a href="https://satoshihero.com/en/register?r=2eedd708" rel="noopener" target="_blank">SatoshiHero.com</a>

How can I achieve that?

1 Answer

Either loop over all the links inside each heading and print every href:

for a in link.find_all('a'):
    print(a.get('href'))

OR take only the href of the last link in the heading:

print(link.find_all('a')[-1].get('href'))

Refer to this answer: https://stackoverflow.com/a/3075568/6695297
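
Note that `.get('href')` returns the full URL, including the `?r=...` part. A minimal sketch of how the query string could also be dropped is shown here, assuming the standard-library `urllib.parse` is acceptable. The inline `html` string stands in for the real page, and `strip_query` is a hypothetical helper name, not something from the question or from BeautifulSoup:

```python
from urllib.parse import urlsplit, urlunsplit

from bs4 import BeautifulSoup

# Sample markup standing in for the real page (an assumption for illustration).
html = """
<h2><a href="https://satoshihero.com/en/register?r=2eedd708" rel="noopener" target="_blank">SatoshiHero.com</a></h2>
"""


def strip_query(url):
    """Return url without its query string and fragment (hypothetical helper)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# html.parser is used here only to avoid the lxml dependency; 'lxml' works too.
soup = BeautifulSoup(html, "html.parser")
links = []
for heading in soup.find_all("h2"):
    anchor = heading.find_all("a")[-1]           # last <a> inside this <h2>
    links.append(strip_query(anchor.get("href")))

print(links)
```

This version also drops the trailing `?`. If the `?` from the desired output really has to stay, it can be appended to the result afterwards.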

Shubham Srivastava
  • How does this eliminate the query parameters? – DarkKnight Oct 17 '22 at 08:06
  • In newer code avoid old syntax `findAll()` instead use `find_all()` or `select()` with `css selectors` - For more take a minute to [check docs](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#method-names) – HedgeHog Oct 17 '22 at 08:31
  • @HedgeHog Traceback (most recent call last): File "web.py", line 10, in for links in link.find_all('a')[-1]: IndexError: list index out of range <--- can you solve this – John Fitzgerald Kennedy Oct 17 '22 at 08:55
  • @HedgeHog here my code --> from bs4 import BeautifulSoup import requests url = 'https://cryptofaucets.cash/2021/02/01/best-free-bitcoin-faucets/' r = requests.get(url).text soup = BeautifulSoup(r,'lxml') # print(soup.prettify()) l = [] for link in soup.find_all('h2'): for links in link.find_all('a')[-1]: print((f"https://{links.text}")) – John Fitzgerald Kennedy Oct 17 '22 at 08:56
  • @JohnFitzgeraldKennedy This is an IndexError, which means you are trying to access an element at an index that does not exist. In this case link.find_all('a') returns an empty list, and an empty list has no element at index -1. – Shubham Srivastava Oct 18 '22 at 09:03
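
The IndexError discussed in the comments can be avoided by checking whether a heading contains any links before indexing. The following is only a sketch under assumptions: the inline `html` string and the `example.com` URLs are made up for illustration, and `html.parser` is used in place of `lxml` to keep the example dependency-light:

```python
from bs4 import BeautifulSoup

# Made-up markup: the first <h2> has no link, the second has two.
html = """
<h2>Heading without a link</h2>
<h2><a href="https://example.com/first">first</a> <a href="https://example.com/last?x=1">last</a></h2>
"""

soup = BeautifulSoup(html, "html.parser")
hrefs = []
for heading in soup.find_all("h2"):
    # href=True keeps only <a> tags that actually carry an href attribute.
    anchors = heading.find_all("a", href=True)
    if not anchors:
        continue                      # skip headings with no links instead of raising IndexError
    hrefs.append(anchors[-1]["href"])

print(hrefs)
```

Skipping empty headings is one choice; collecting them separately for debugging would work just as well.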