I am trying to scrape this page: https://www.axisbank.com/retail/cards/credit-card

I am using the following code:

from bs4 import BeautifulSoup
import requests

axis_url = ["https://www.axisbank.com/retail/cards/credit-card"]

html = requests.get(axis_url[0])
soup = BeautifulSoup(html.content, 'lxml')

for d in soup.find_all('span'):
    print(d.get_text())

Output:

close
5.15%
%
4.00%
%
5.40%

Basically, I want to get the details of every card listed on that page.

I have tried different tags, but none of them seem to work.

I'd be happy to see the code that satisfies my requirement.

Any help is highly appreciated.

1 Answer

What happens?

Your main issue is that the website serves its content dynamically. requests only returns the initial HTML, before any JavaScript has run, so you will not reach your goal the way you are requesting the page. Print your soup and take a look: it will not contain the elements you are inspecting in the browser.
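A quick way to confirm this without reading the whole soup is to test for one element you expect. The sketch below is only an illustration: `has_element` is a hypothetical helper, and both HTML strings are made-up stand-ins (one for what requests might return, one for the DOM after the browser has run its scripts), not the real markup of the page. It uses `html.parser` so it runs without lxml.

```python
from bs4 import BeautifulSoup


def has_element(html: str, css_selector: str) -> bool:
    """Return True if the HTML string contains a match for the selector."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(css_selector) is not None


# Hypothetical stand-in for the raw response: an empty shell
# that a script would fill in later.
static_html = '<div id="cards"></div><script src="app.js"></script>'

# Hypothetical stand-in for the DOM after the scripts have run.
rendered_html = (
    '<div id="cards"><ul id="ulCreditCard">'
    "<li><ul><li><span>Card A</span></li></ul></li>"
    "</ul></div>"
)

print(has_element(static_html, "#ulCreditCard"))    # False
print(has_element(rendered_html, "#ulCreditCard"))  # True
```

In your own script you would pass `html.text` from the requests call instead of the stand-in strings. If the check returns False there but the element is visible in the browser's inspector, the content is being added after the initial load.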

How to fix?

Use Selenium, which can handle dynamically generated content and will give you the information you inspected in the browser:

Example

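from bs4 import BeautifulSoup
from selenium import webdriver

# Note: executable_path works on older Selenium releases; Selenium 4 expects
# webdriver.Chrome(service=Service(path)) from selenium.webdriver.chrome.service.
driver = webdriver.Chrome(executable_path=r'C:\Program Files\ChromeDriver\chromedriver.exe')
url = 'https://www.axisbank.com/retail/cards/credit-card'
driver.get(url)

# Parse the HTML after the browser has rendered it
soup = BeautifulSoup(driver.page_source, 'lxml')

# quit() ends the whole browser session, not just the current window
driver.quit()

# Collect the text of each span that sits directly inside a nested list item
textList = []
for d in soup.select('#ulCreditCard li li > span'):
    textList.append(d.get_text('^^', strip=True))

print(textList)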
HedgeHog
  • Thanks for your apt reply. Could you please explain what this selector is doing: '#ulCreditCard li li > span'? –  Feb 17 '21 at 07:13
  • Happy to help. You can still work with your own selection, but it is less specific than the [css selector](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors), which targets the `ul` with id `ulCreditCard`, then the `li`s nested inside its `li`s, and finally the `span`s that are direct children of those inner `li`s – HedgeHog Feb 17 '21 at 07:29
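
To make the difference between the two selections concrete, here is a small sketch on made-up markup. `sample_html` is an assumption that only mimics the nesting the selector expects; the real page may be structured differently. It uses `html.parser` so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup that only mimics the nesting the selector expects.
sample_html = """
<span>close</span>
<ul id="ulCreditCard">
  <li>
    <span>outer span</span>
    <ul>
      <li><span>Card A</span><p><span>nested deeper</span></p></li>
      <li><span>Card B</span></li>
    </ul>
  </li>
</ul>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find_all('span') returns every span in the document, wanted or not.
all_spans = [s.get_text(strip=True) for s in soup.find_all("span")]

# The CSS selector keeps only spans that are direct children of an li
# which itself sits inside another li under the element with id ulCreditCard.
specific = [
    s.get_text(strip=True)
    for s in soup.select("#ulCreditCard li li > span")
]

print(all_spans)  # ['close', 'outer span', 'Card A', 'nested deeper', 'Card B']
print(specific)   # ['Card A', 'Card B']
```

The `>` is what excludes 'nested deeper': that span lives inside a `p`, so it is a descendant of the inner `li` but not a direct child of it.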