BeautifulSoup has CSS selector support built in:
>>> from bs4 import BeautifulSoup
>>> from urllib.request import urlopen  # Python 3; on Python 2 this was urllib2
>>> soup = BeautifulSoup(urlopen("https://google.com"), "html.parser")
>>> soup.select("input[name=q]")
[<input autocomplete="off" class="lst" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top" title="Google Search" value=""/>]
There is also the cssselect package, which you can use in combination with lxml.
Note that CSS selector support in BeautifulSoup has certain limitations; lxml + cssselect supports more CSS selectors:
This is all a convenience for users who know the CSS selector syntax.
You can do all this stuff with the Beautiful Soup API. And if CSS
selectors are all you need, you might as well use lxml directly: it’s
a lot faster, and it supports more CSS selectors. But this lets you
combine simple CSS selectors with the Beautiful Soup API.
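To illustrate the last point, here is a minimal sketch that uses a CSS selector to locate an element and then continues with ordinary Beautiful Soup navigation. The `SAMPLE` markup and the helper name `form_action_for` are hypothetical; `select()`, `find_parent()` and `get()` are regular Beautiful Soup calls.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page.
SAMPLE = """
<form action="/search">
  <input name="q" title="Search">
  <input name="btn" type="submit" value="Go">
</form>
"""


def form_action_for(markup, selector):
    """Find the first element matching the selector and return its form's action."""
    soup = BeautifulSoup(markup, "html.parser")
    matches = soup.select(selector)        # CSS selector step
    if not matches:
        return None
    form = matches[0].find_parent("form")  # plain Beautiful Soup navigation
    return form.get("action") if form is not None else None


print(form_action_for(SAMPLE, "input[name=q]"))
```

The selector handles the "find it" part concisely, while tree navigation such as `find_parent()` covers steps that plain CSS selectors do not express.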