I want to scrape the HTML codes from the URL listed below. The problem is, I get this error:-
Aug 14, 2016 6:40:36 PM booksscraper.BooksScraper main SEVERE: null org.jsoup.HttpStatusException: HTTP error fetching URL. Status=504, URL=http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10293&langId=-1&programId=636&termId=100043741&divisionDisplayName=%20&departmentDisplayName=ACCG&courseDisplayName=16971§ionDisplayName=P15%20DAVIS&demoKey=d&purpose=browse at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:590) at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540) at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227) at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216) at booksscraper.BooksScraper.main(BooksScraper.java:52)
I have set the timeout to infinity, but that did not help. The HTML code for this website is extremely large i.e. 14833 lines of code. Is this the reason for the problem?
String url = "http://www.bkstr.com/webapp/wcs/stores/servlet/CourseMaterialsResultsView?catalogId=10001&categoryId=9604&storeId=10293&langId=-1&programId=636&termId=100043741&divisionDisplayName=%20&departmentDisplayName=ACCG&courseDisplayName=16971§ionDisplayName=P15%20DAVIS&demoKey=d&purpose=browse";
Document doc = Jsoup.connect(url)
.maxBodySize(0)
.timeout(0)
.get();
System.out.println(doc);