
I wrote a Java program like the one I saw here: How to read the https page content using java? For some sites, however, the code does not work.

I get the error: Server returned HTTP response code: 403 for URL: https://research.investors.com/stock-quotes/nyse-sailpoint-tech-holdings-sail.htm

It works for url = "https://maven.apache.org/guides/mini/guide-repository-ssl.html";

Can someone help me?

alin

3 Answers


403 Forbidden The request contained valid data and was understood by the server, but the server is refusing action. This may be due to the user not having the necessary permissions for a resource or needing an account of some sort, or attempting a prohibited action (e.g. creating a duplicate record where only one is allowed). This code is also typically used if the request provided authentication by answering the WWW-Authenticate header field challenge, but the server did not accept that authentication. The request should not be repeated.

So the website you want to scrape has probably restricted requests like yours (i.e., requests that were not made from a browser).

As a workaround, you can try Selenium, which drives a real browser.

Apollo917

The 403 HTTP status stands for "Forbidden"; most likely investors.com inspects your request headers and denies access to the resource.

Try modifying the request headers, using a User-Agent value that the site accepts.
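A minimal sketch of that idea with `HttpURLConnection` (the User-Agent string below is just an example of a browser-like value, not one the site is guaranteed to accept):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentDemo {
    public static void main(String[] args) throws Exception {
        // openConnection() only creates the connection object; no request is sent yet
        HttpURLConnection con = (HttpURLConnection) new URL(
                "https://research.investors.com/stock-quotes/nyse-sailpoint-tech-holdings-sail.htm")
                .openConnection();
        // Present the request as coming from a browser (example value)
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36");
        System.out.println(con.getRequestProperty("User-Agent"));
    }
}
```

Calling `con.getInputStream()` afterwards would actually send the request with that header attached.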

OscarRyz

OK, I solved it. I used con.setRequestProperty to set the "User-Agent", "Accept", "Content-Type", and "Accept-Language" headers.
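For reference, a sketch of that fix, assuming browser-like header values (the exact strings are illustrative, not necessarily the ones I used; reading the response body then works as in the linked question):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class Fetch403Fix {
    public static void main(String[] args) throws Exception {
        URL url = new URL(
                "https://research.investors.com/stock-quotes/nyse-sailpoint-tech-holdings-sail.htm");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        // Browser-like request headers; the values are examples
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36");
        con.setRequestProperty("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.9");
        con.setRequestProperty("Content-Type", "text/html"); // rarely needed on a GET, but listed above
        // No request has been sent yet; con.getInputStream() would perform it
        // and return the page content to read line by line.
        System.out.println(con.getRequestProperty("Accept-Language"));
    }
}
```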

Thank you.

alin