Error during web scraping using Python

Question

I am trying to scrape the replies on a news article.

I have tried many times, but all I get is a traceback.

I wrote the code like this:

import re
import urllib.request
import urllib
import requests
from bs4 import BeautifulSoup

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'
html=request.get(url)
#print(html.text)
a=html.text
bs=BeautifulSoup(a,'html.parser')
print(bs.prettify())
bs.find('span',class="u_cbox_contents")

When I run the line `bs.find('span',class="u_cbox_contents")`, I get this error:

SyntaxError: invalid syntax

How can I fix the code so that it runs? Please help me.

I am running Python 3.4.4 on 64-bit Windows 8.1.

Thanks for reading.

Never, ever, ever, ever use `urllib` when you can just use `requests` instead. – Akshat Mahajan, Jun 30 '16 at 02:00
@AkshatMahajan Do you mean I should try this code? `import re`, `import urllib.request`, `from bs4 import BeautifulSoup`, `url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'`, `html=urllib.request.urlopen(url)`. It did not work; I see the same error. – L.kyunam, Jun 30 '16 at 02:04
No, I mean you're making a request using the `urllib` library instead of the `requests` library. `requests` is just a lot easier to work with. Do `html = requests.get(url)`. – Akshat Mahajan, Jun 30 '16 at 02:08

shaojl7 · Accepted Answer · 2016-06-30T05:40:33.940

3

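Following @AkshatMahajan's advice, the code below uses the requests module instead. The last line is also changed to find the desired element: `class` is a reserved word in Python, so it cannot be used as a keyword argument and is passed through `attrs` instead.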

##import re
##import urllib.request
##import urllib
import requests
from bs4 import BeautifulSoup

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'
html=requests.get(url)
#print(html.text)
a=html.text
bs=BeautifulSoup(a,'html.parser')
print(bs.prettify())
print(bs.find('span',attrs={"class" : "u_cbox_contents"}))

Thanks to @DiogoMartins for pointing out the correct Python version as well.
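As a small offline illustration of the fix, here is a sketch that runs against an inline HTML string instead of the live page. The `SAMPLE_HTML` markup is made up for this example and is only assumed to resemble the real page. It shows the `attrs` dictionary used above next to BeautifulSoup's `class_` keyword (with a trailing underscore), which is another way to avoid the reserved word.

```python
import keyword

from bs4 import BeautifulSoup

# Made-up sample markup for illustration; not the real Naver page.
SAMPLE_HTML = """
<div>
  <span class="u_cbox_contents">first comment</span>
  <span class="u_cbox_contents">second comment</span>
  <span class="other">not a comment</span>
</div>
"""

# `class` is a reserved word, so `class="..."` cannot be a keyword argument.
is_reserved = keyword.iskeyword("class")

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Option 1: pass the attribute through the attrs dictionary.
by_attrs = soup.find("span", attrs={"class": "u_cbox_contents"})

# Option 2: use BeautifulSoup's class_ keyword (note the underscore).
by_class_ = soup.find("span", class_="u_cbox_contents")

# find_all returns every matching tag instead of only the first one.
all_comments = [
    tag.get_text() for tag in soup.find_all("span", class_="u_cbox_contents")
]

print(is_reserved)
print(by_attrs.get_text())
print(by_class_.get_text())
print(all_comments)
```

Both options return the same first matching tag; `find_all` is the one to reach for when every reply is wanted rather than just the first.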

edited Jun 30 '16 at 05:40

answered Jun 30 '16 at 04:26


Have you just copied the answer @akshat gave in the comments? – Diogo Martins Jun 30 '16 at 04:34

@DiogoMartins Yes, I took @akshat's advice in the comments to change to requests. I also changed the last line, as there was an invalid syntax error in the original line of code `bs.find('span',class="u_cbox_contents")`. Hope this also helps. – shaojl7 Jun 30 '16 at 04:52

The right thing to do is to give credit to @akshat in your answer. Also, the last line of your answer would result in a SyntaxError, since the question states that the asker is running Python 3.4. – Diogo Martins Jun 30 '16 at 05:20
@DiogoMartins Thanks for the advice, and also for pointing out the error. I should have added that in the beginning. I will edit the answer accordingly. – shaojl7 Jun 30 '16 at 05:36
I really, really appreciate you both! I am really happy for your help. The world is still a good place! Thanks a lot! – L.kyunam Jun 30 '16 at 05:58
Helping each other is the whole point of Stack Overflow. Glad to help. – Diogo Martins Jun 30 '16 at 06:11
