Python fetch data 403

Question

I am trying to fetch data from a web page using urllib2. The page loads fine in the browser, but from the script I keep getting "HTTPError: HTTP Error 403: Forbidden".

I also tried mimicking a browser request by changing the User-Agent string, but with no success.

Any ideas on this?

Does the site require authentication? If yes, how are users being tracked? Does the site use cookies to track authenticated users? If yes, you need to send a cookie along with your HTTP request. — Darin Dimitrov, Dec 28 '10 at 12:40
Can you give some more details of the website and the code you are using to access it? It may not be an issue with the User-Agent. — Senthil Kumaran, Dec 28 '10 at 12:40
@Darin - no authentication required. Cookies, I will have to check. This is the URL of the page I am trying to fetch: http://www.nseindia.com/content/fo/fo_underlyinglist.htm — zubinmehta, Dec 28 '10 at 12:43

score 2 · Accepted Answer · answered Dec 28 '10 at 13:19

Using Tamper Data in Firefox, I sent a request with only the User-Agent header and got a 403. Try adding the other headers a browser sends:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive

I tried it with these headers, and it should work.
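As a rough sketch of how those headers could be sent from a script: the example below uses Python 3's urllib.request, the successor to urllib2 (in Python 2, urllib2.Request takes the same headers argument). The names BROWSER_HEADERS, build_request and fetch, and the "Mozilla/5.0" User-Agent value, are placeholders chosen for illustration, not something from the answer. Keep-Alive and Connection are left out because urllib manages the connection itself and sends "Connection: close". Whether the site still accepts this exact set is an assumption.

```python
import urllib.request

# URL taken from the question's comments.
URL = "http://www.nseindia.com/content/fo/fo_underlyinglist.htm"

# Browser-like headers listed in the answer above.
BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-us,en;q=0.5",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7",
}


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a Request carrying the browser-like headers.

    The default user_agent is a placeholder, not a value from the thread.
    """
    headers = dict(BROWSER_HEADERS)
    headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers)


def fetch(url, timeout=10):
    """Fetch the page body as bytes; raises urllib.error.HTTPError on a 403."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(len(fetch(URL)))
```

If this still returns 403, comparing the headers against what the browser actually sends (for example with Tamper Data, as above) is a reasonable next step.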

score 1 · Answer 2 · answered Dec 28 '10 at 12:49

The site is checking your User-Agent; just set it to Internet Explorer:

request.add_header('User-Agent', 'Internet Explorer')

I confirmed that this works with wget, and you get 403 unless you set your user agent to Internet Explorer.

ismail

score 0 · Answer 3 · answered Dec 31 '10 at 14:32

:) I am trying to get quotes from NSE too! As pythonFoo says, you need additional headers. However, Accept alone is sufficient. The User-Agent can say Python (stay true!).
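Since the answers here disagree on which header the site actually checks, one way to find out is to try each header set and compare the status codes. This is only a sketch: status_for and its open_url parameter are hypothetical names introduced here (open_url exists so the helper can be exercised without network access), and the result for the NSE page is not something this example can promise.

```python
import urllib.error
import urllib.request


def status_for(url, headers, open_url=urllib.request.urlopen):
    """Return the HTTP status code the server gives for this header set."""
    request = urllib.request.Request(url, headers=headers)
    try:
        response = open_url(request)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code (e.g. 403) is on the exception.
        return err.code
    with response:
        return response.status


if __name__ == "__main__":
    url = "http://www.nseindia.com/content/fo/fo_underlyinglist.htm"
    candidates = {
        "no extra headers": {},
        "Accept only": {"Accept": "text/html"},
        "User-Agent only": {"User-Agent": "Internet Explorer"},
    }
    for label, headers in candidates.items():
        print(label, status_for(url, headers))
```

Note that urllib adds its own default User-Agent (Python-urllib/x.y) when none is given, so the "Accept only" case still identifies itself as Python.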

Samvid
