I'd like to get items from a website with BeautifulSoup
.
<div class="post item">
The target tag is this. The tag has two attrs and white space.
First, I wrote,
roots = soup.find_all("div", "post item")
But, it didn't work. Then I wrote,
html.find_all("div", {'class':['post', 'item']})
I could get items with this,but I am nost sure if this is correct or not. is this code correct?
//// Additional ////
I am sorry,
html.find_all("div", {'class':['post', 'item']})
didn't work properly.
It also extracts class="item"
.
And, I had to write,
soup.find_all("div", class_="post item")
not =
but _=
. Although this doesn't work for me...(>_<)
Target url:
https://flipboard.com/section/%E3%83%8B%E3%83%A5%E3%83%BC%E3%82%B9-3uscfrirj50pdtqb
mycode:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from urllib.request import urlopen
from bs4 import BeautifulSoup
def main():
target = "https://flipboard.com/section/%E3%83%8B%E3%83%A5%E3%83%BC%E3%82%B9-3uscfrirj50pdtqb"
html = urlopen(target)
soup = BeautifulSoup(html, "html.parser")
roots = soup.find_all("div", class_="post item")
print(roots)
for root in roots:
print("##################")
if __name__ == '__main__':
main()