Find html tag using substring with beautifulsoup in python3

Question

With the following code:

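import re
import requests
from bs4 import BeautifulSoup
url = 'http://lampspw.wallonie.be/dgo4/site_ipic/index.php/fiche/index?sortCol=2&sortDir=asc&start=0&nbElemPage=10&filtre=&codeInt=62121-INV-0018-02'
page = requests.get(url)  # assumed: the original snippet did not show how page was fetched
soup = BeautifulSoup(page.content, 'html.parser')
t = soup.find_all("div", attrs={'class': 'panel-heading'})
lst = [x.text for x in t]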

I obtain:

['\xa0Filtres complémentaires',
 '\xa0Recherche dans les notices',
 'Libellé(s)\xa0',
 'Illustration(s)',
 'Localisation',...]

If I look for a particular tag (contained in that list) directly in soup with a substring:

In [290]: soup.find_all("div", string=re.compile('Locali'))
Out[291]: [<div class="panel-heading">Localisation</div>]

I get back one of the tags I want. But if I do:

In :soup.find_all("div", string=re.compile('Libe'))
Out: []

Can someone explain the problem here? I guess it lies in the HTML, but I cannot find it.

`soup.find_all(string=re.compile('Libe'))` will get the result; the problem may be caused by the tag. — KC., Oct 26 '18 at 07:24
It may be an issue in bs4, because I found that `html = '
Localisation
'` is not found when using `soup.find_all("div", string=re.compile('Locali'))` — KC., Oct 26 '18 at 11:57
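
One possible explanation for what the comments above describe, sketched on hypothetical markup because the real page's HTML is not shown in this thread: when a tag name and `string=` are passed together, Beautiful Soup compares the pattern against each tag's `.string`, and `.string` is `None` as soon as the tag has more than one child. The `<span class="badge">` below is an assumption chosen only to give the div a second child.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the second heading holds a text node plus a child <span>.
html = (
    '<div class="panel-heading">Localisation</div>'
    '<div class="panel-heading">Libellé(s)\xa0<span class="badge"></span></div>'
)
soup = BeautifulSoup(html, 'html.parser')
simple, mixed = soup.find_all('div', attrs={'class': 'panel-heading'})

simple_string = simple.string   # single text child, so .string is that text
mixed_string = mixed.string     # two children, so .string is None
mixed_text = mixed.get_text()   # get_text() / .text still joins all nested text

# With a tag name, the regex is tested against .string, so the mixed div is skipped.
by_tag_and_string = soup.find_all('div', string=re.compile('Libe'))
# Without a tag name, the regex is tested against the text nodes themselves.
by_string_only = soup.find_all(string=re.compile('Libe'))
```

This would also account for `x.text` listing 'Libellé(s)\xa0' in the question while the `string=` search on divs returns nothing.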

score 0 · Accepted Answer · answered Oct 26 '18 at 11:44

Thanks to kcorlidy: soup.find_all(string=re.compile('Libe')) returns the result. Note that it returns the matching text node rather than the div itself.
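
If the enclosing div is what is needed rather than the text node, two approaches that should work are sketched below. The markup is again hypothetical, and `heading_matches` is a helper name introduced only for this illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = '<div class="panel-heading">Libellé(s)\xa0<span class="badge"></span></div>'
soup = BeautifulSoup(html, 'html.parser')
pattern = re.compile('Libe')

# Option 1: match the text nodes, then climb to the nearest enclosing <div>.
divs_via_parent = [s.find_parent('div') for s in soup.find_all(string=pattern)]

# Option 2: filter <div> tags on their full text, as x.text does in the question.
def heading_matches(tag):
    return tag.name == 'div' and pattern.search(tag.get_text()) is not None

divs_via_filter = soup.find_all(heading_matches)
```

Option 2 tests the combined text of each div, so on a real page it would also match any outer div that wraps the heading. Adding a class check such as `'panel-heading' in tag.get('class', [])` to the filter should narrow it down.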

francois
