I am trying to scrape the card image link from the page below, but I am not able to.

Link : https://www.online.citibank.co.in/credit-card/rewards/citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM

I have used the below code

from urllib.request import urlopen
from bs4 import BeautifulSoup

x = 'https://www.online.citibank.co.in/credit-card/rewards/citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM'
html = urlopen(x)
soup = BeautifulSoup(html, 'lxml')
print(soup.find('div', class_="m-top-sm block-hero-art-2 display-image"))

Output:

<img _ngcontent-c11="" alt="Citi Logo" class="logo" crossorigin="anonymous" src="https://www.cdn.citibank.com/v1/ingcb/cbol/files/images/logos/logo.png?_bust=2021-01-21T05-05-29-195Z"/>

But the src I get is the wrong link: it points to the Citi logo, not to the card image.

The highlighted part of the HTML in the screenshot is where the image link resides.

[Screenshot: the image to be scraped, with its tag]

Which tag should I use to get that exact image link? Could anyone help me with alternative code that gives the desired result?

Ali Baba
  • 1
    This card image is added dynamically by JS so `bs4` doesn't see this in the source `HTML`. In other words, just turn JavaScript off on that site and see what *actually* is there. – baduker Feb 11 '21 at 10:53
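To illustrate the comment above, here is a minimal sketch of why the lookup comes back without the card image. The two HTML strings and the `display-image` class are hypothetical stand-ins, not the real Citibank markup: one mimics what the server sends, the other what the browser holds after JavaScript has run. The helper name `image_src` is also made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins, not the real page markup:
# STATIC_HTML mimics the raw server response, RENDERED_HTML mimics the DOM
# after JavaScript has inserted the image.
STATIC_HTML = '<div class="display-image"></div>'
RENDERED_HTML = (
    '<div class="display-image">'
    '<img src="https://example.com/card.png">'
    '</div>'
)


def image_src(html, div_class="display-image"):
    """Return the src of the first <img> inside the matching <div>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_=div_class)
    img = div.find("img") if div is not None else None
    return img.get("src") if img is not None else None


print(image_src(STATIC_HTML))    # the div is there, but it holds no <img>
print(image_src(RENDERED_HTML))  # the <img> exists only after rendering
```

The parsing code is identical in both calls; only the input differs, which is why switching tags or selectors in `bs4` alone would not help if the image is really injected by JavaScript.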

1 Answer

As @baduker's comment says, the card image is added dynamically by JavaScript, so bs4 doesn't see it in the source HTML. You should try Selenium together with bs4:

from bs4 import BeautifulSoup
from selenium import webdriver

x = 'https://www.online.citibank.co.in/credit-card/rewards/citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM'
wb = webdriver.Chrome()
wb.get(x)

# Parse the HTML as rendered by the browser, after JavaScript has run
soup = BeautifulSoup(wb.page_source, 'lxml')
wb.quit()

div = soup.find('div', class_="m-top-sm block-hero-art-2 display-image")
print(div)
print(div.find('img').get('src'))

To install Selenium, run this in your terminal:

pip install selenium
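Reading `page_source` right after `get()` can still miss the image if the script has not finished yet. As a hedged variation on the code above (a sketch, not tested against the live page), the version below runs Chrome headless and waits explicitly until an `<img>` appears inside the target `div`. The CSS selector is derived from the class string in the question, and the function names `src_from_html` and `fetch_rendered_html` and the 20-second timeout are assumptions made for this example.

```python
from bs4 import BeautifulSoup

URL = ("https://www.online.citibank.co.in/credit-card/rewards/"
       "citi-rewards-credit-card?eOfferCode=INCCCCTWAFCTRELM")
# CSS form of the class string used in the question
DIV_SELECTOR = "div.m-top-sm.block-hero-art-2.display-image"


def src_from_html(html, selector=DIV_SELECTOR):
    """Return the src of the first <img> under `selector`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    img = soup.select_one(selector + " img")
    return img.get("src") if img is not None else None


def fetch_rendered_html(url, selector=DIV_SELECTOR, timeout=20):
    """Load `url` in headless Chrome and return the HTML once the image exists."""
    # Imported here so the parsing helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until JavaScript has inserted the <img>, or raise TimeoutException.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector + " img"))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


# Usage (needs Chrome and network access):
# print(src_from_html(fetch_rendered_html(URL)))
```

If `webdriver.Chrome()` fails with a `FileNotFoundError`, that usually means the ChromeDriver executable could not be found on PATH. Selenium 4.6 and later try to locate or download a matching driver automatically, so upgrading Selenium is one thing to try.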
Samsul Islam
  • `Traceback (most recent call last): File "C:\Users\Hari\PycharmProjects\Card_Prj\venv\lib\site-packages\selenium\webdriver\common\service.py", line 72, in start self.process = subprocess.Popen(cmd, env=self.env, File "C:\Python39\lib\subprocess.py", line 947, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "C:\Python39\lib\subprocess.py", line 1416, in _execute_child hp, ht, pid, tid = _winapi.CreateProcess(executable, args, FileNotFoundError: [WinError 2] The system cannot find the file specified` Above code gave me this error. – Ali Baba Feb 11 '21 at 11:23
  • 1
    follow https://www.easeus.com/resource/the-system-cannot-find-the-file-specified.html – Samsul Islam Feb 11 '21 at 11:34
  • Can't I get the desired result by just using beautiful soup ? – Ali Baba Feb 11 '21 at 11:38
  • 1
    No, BeautifulSoup alone is not enough here; you need Selenium to get the dynamic content – Samsul Islam Feb 11 '21 at 11:40
  • 1
    I followed your advice and now I am able to scrape the image links. Thanks a lot!!! – Ali Baba Feb 11 '21 at 13:22