I am trying to scrape the replies (comments) on a news article.

I have tried many times, but all I get is a traceback.

Please help me.

I wrote this code:

import re
import urllib.request
import urllib
import requests
from bs4 import BeautifulSoup

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'
html=request.get(url)
#print(html.text)
a=html.text
bs=BeautifulSoup(a,'html.parser')
print(bs.prettify())
bs.find('span',class="u_cbox_contents")

When I run this line: bs.find('span',class="u_cbox_contents")

I get an error.

The error is:

SyntaxError: invalid syntax

How can I fix the code so that it runs?

Please help me.

I am running Python 3.4.4 on Windows 8.1 (64-bit).

Thanks for reading.

L.kyunam
  • Never, ever, ever, ever use `urllib` when you can just use `requests` instead. – Akshat Mahajan Jun 30 '16 at 02:00
  • @AkshatMahajan Do you mean I should try this code? `import re import urllib.request from bs4 import BeautifulSoup url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1' html=urllib.request.urlopen(url)` It did not work either; I see the same error. – L.kyunam Jun 30 '16 at 02:04
  • No, I mean you're making a request using the `urllib` library instead of the `requests` library. `requests` is just a lot easier to work with. Do `html = requests.get(url)`. – Akshat Mahajan Jun 30 '16 at 02:08
  • @AkshatMahajan Wow, you are a genius! How do I vote for you? – L.kyunam Jun 30 '16 at 02:18
  • You can't mark comments as an accepted answer. – Akshat Mahajan Jun 30 '16 at 02:19

1 Answer

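Following @AkshatMahajan's advice, the code below uses the `requests` module instead of `urllib`. In addition, the last line is changed to pass the class through `attrs`: `class` is a reserved keyword in Python, so writing `class="u_cbox_contents"` as a keyword argument is what raises the SyntaxError.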

##import re
##import urllib.request
##import urllib
import requests
from bs4 import BeautifulSoup

url='http://news.naver.com/main/ranking/read.nhn?mid=etc&sid1=111&rankingType=popular_week&oid=277&aid=0003773756&date=20160622&type=1&rankingSectionId=102&rankingSeq=1&m_view=1'
html=requests.get(url)
#print(html.text)
a=html.text
bs=BeautifulSoup(a,'html.parser')
print(bs.prettify())
print(bs.find('span',attrs={"class" : "u_cbox_contents"}))
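To see why the original line fails before any request is made, here is a minimal sketch that only parses the three spellings of the call without running them. The helper name `parses_ok` is made up for this illustration, and `bs` is never actually looked up because the code is only compiled, not executed.

```python
import keyword

# `class` is a reserved word in Python, so it cannot be a keyword-argument name.
# `class_` (with a trailing underscore) is an ordinary identifier.
print(keyword.iskeyword("class"))
print(keyword.iskeyword("class_"))


def parses_ok(source):
    """Return True if `source` is a syntactically valid Python expression.

    Hypothetical helper for this example: it only compiles the text,
    so names such as `bs` do not need to exist.
    """
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


original_line = "bs.find('span', class=\"u_cbox_contents\")"
with_attrs = "bs.find('span', attrs={'class': 'u_cbox_contents'})"
with_underscore = "bs.find('span', class_='u_cbox_contents')"

for line in (original_line, with_attrs, with_underscore):
    print(parses_ok(line), line)
```

Only the first spelling is rejected by the parser, which is why the error shows up as a SyntaxError rather than as a problem with the page or the request.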

Thanks to @DiogoMartins for pointing out the correct Python version as well
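As a rough illustration of the lookup itself, the sketch below runs against a small hard-coded HTML string rather than the live page, so the markup here is an assumption and the real page may be structured differently. It shows the `attrs` form from the answer next to BeautifulSoup's `class_` keyword, which does the same thing.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the real page's HTML may differ.
html = """
<div>
  <span class="u_cbox_contents">first comment</span>
  <span class="u_cbox_contents">second comment</span>
</div>
"""

bs = BeautifulSoup(html, "html.parser")

# Two equivalent ways to match on the class attribute.
first = bs.find("span", attrs={"class": "u_cbox_contents"})
same = bs.find("span", class_="u_cbox_contents")

# find_all returns every match instead of only the first one.
all_texts = [
    span.get_text(strip=True)
    for span in bs.find_all("span", class_="u_cbox_contents")
]

# find returns None when nothing matches, so check before using the result.
missing = bs.find("span", class_="no_such_class")

print(first.get_text())
print(all_texts)
print(missing)
```

If `find` returns `None` on the real page, the element is probably not in the HTML that `requests` downloaded (for example, because the page fills it in later with JavaScript), which is a separate problem from the SyntaxError.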

shaojl7
  • Have you just copied the answer @akshat gave in the comments? – Diogo Martins Jun 30 '16 at 04:34
  • @DiogoMartins Yes, I took @akshat's advice from the comments and switched to requests. I also changed the last line, because the original line `bs.find('span',class="u_cbox_contents")` had an invalid syntax error. Hope this also helps. – shaojl7 Jun 30 '16 at 04:52
  • The right thing to do is to give credit to @akshat in your answer. Also, the last line of your answer would result in a SyntaxError, since the question states that he is running Python 3.4. – Diogo Martins Jun 30 '16 at 05:20
  • @DiogoMartins Thanks for the advice, and for pointing out the error. I should have added that from the beginning. I will edit the answer accordingly. – shaojl7 Jun 30 '16 at 05:36
  • I really, really appreciate you both! I am so happy for your help. The world is still a good place! Thanks a lot! – L.kyunam Jun 30 '16 at 05:58
  • Helping each other is the whole point of Stack Overflow. Glad to help. – Diogo Martins Jun 30 '16 at 06:11