
I started Python last week, so apologies if this is a beginner question. I'm trying to scrape a value that depends on selecting an HTML "<li>" tag. On the following page (https://www.investing.com/indices/france-40-technical) I would like to scrape the RSI value from the bottom-left table, for every available time unit under "CAC40 Technical Analysis".

As far as I can tell, I have to select the right "li" element from Python, but I can't find how to select the time units in those tags. My code is below.

Could you help a beginner?

Thanks!

from bs4 import BeautifulSoup
import requests

url="https://www.investing.com/indices/france-40-technical"

headers = {
    'authority': 'www.investing.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'accept-language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
    'cache-control': 'max-age=0',
    'referer': 'https://www.investing.com/indices/france-40',
    'sec-ch-ua': '"Chromium";v="106", "Google Chrome";v="106", "Not;A=Brand";v="99"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
}
reponse=requests.get(url,headers=headers)

if reponse.ok:
    print("Site response OK")

    soup=BeautifulSoup(reponse.text,"html.parser")

    cac40=soup.find('span',{'id':'last_last'}).get_text().strip()

    print(cac40)
    
    rsicac40_5hours_select1=soup.find('div',{'id':'technicalstudiesSubTabs'})
    rsicac40_5hours_select2=rsicac40_5hours_select1.find('li',{'data-period':'18000'})
    rsicac40_5hours_div=rsicac40_5hours_select2.find('div',{'class':'halfSizeColumn float_lang_base_1'})
    rsicac40_5hours_class=rsicac40_5hours_div.find('td',{'class':'right'}).get_text().strip()
    print(rsicac40_5hours_class)

This is the error I get:

line 36, in <module>
    rsicac40_5hours_class=rsicac40_5hours_div.find('td',{'class':'right'}).get_text().strip()
AttributeError: 'NoneType' object has no attribute 'find'
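To make the goal concrete, here is a minimal sketch of the kind of selection I am after, run on made-up HTML rather than the real page. The markup, the tab labels, and the names `PeriodTabParser` and `list_periods` are assumptions for illustration. It uses the standard library's `html.parser` instead of BeautifulSoup only so that it runs without extra packages, and it only lists the time-unit tabs (the `data-period` attributes); it does not fetch the RSI values themselves.

```python
from html.parser import HTMLParser

# Made-up markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div id="technicalstudiesSubTabs">
  <ul>
    <li data-period="300"><a href="#">5 Min</a></li>
    <li data-period="3600"><a href="#">Hourly</a></li>
    <li data-period="18000"><a href="#">5 Hours</a></li>
  </ul>
</div>
<ul><li data-period="60">outside the container</li></ul>
"""


class PeriodTabParser(HTMLParser):
    """Collect {data-period: label} for <li> tags inside one container <div>."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0        # <div> nesting depth inside the container (0 = outside)
        self.current = None   # data-period of the <li> currently being read
        self.periods = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Enter the container, or go one level deeper if already inside it.
            if self.depth or attrs.get("id") == self.container_id:
                self.depth += 1
        elif tag == "li" and self.depth and "data-period" in attrs:
            self.current = attrs["data-period"]
            self.periods[self.current] = ""

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
        elif tag == "li":
            self.current = None

    def handle_data(self, data):
        # Text found inside a selected <li> becomes its label.
        if self.current is not None:
            self.periods[self.current] += data.strip()


def list_periods(html, container_id="technicalstudiesSubTabs"):
    parser = PeriodTabParser(container_id)
    parser.feed(html)
    parser.close()
    return parser.periods


periods = list_periods(SAMPLE_HTML)
print(periods)
```

What I cannot work out is how to get from one of these "li" tabs to the table of values for that time unit, since the "div" I look for inside the "li" comes back as None.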

Kyra
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Oct 26 '22 at 08:35
