I'm using Jsoup to parse the HTML of 3 different webpages in the background every 10 minutes or so. However, I found that in 2 days I consumed 18 MB of network data. Is there some way to reduce this huge data consumption? I don't need the whole HTML page; is there a way to download only part of a page's HTML?
-
It depends on the site's structure. Which sites are you parsing? What are the parts that you need? – TDG Jul 13 '15 at 16:32
-
But from what I understand, there is no way to download just part of the HTML: you have to download the whole document and then select the part you need. Anyway, I found that there is a maxBodySize setting that I can use; I think it may help. – Stack Diego Jul 14 '15 at 07:19
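To illustrate the body-size cap mentioned in the comment above, here is a minimal sketch of the underlying idea using only plain Java streams: stop reading once a byte limit is reached. The class name `CappedDownload`, the helper `readAtMost`, and the 16 KB limit are hypothetical and chosen for illustration. With Jsoup itself, the cap would be set through `Connection.maxBodySize(int)`, as noted in the code comment. This only helps if the part you need sits within the first bytes of the page, and the actual savings may vary with network buffering.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class CappedDownload {

    /**
     * Reads at most maxBytes from the stream and returns them.
     * Hypothetical helper; with Jsoup the comparable setting would be
     * Jsoup.connect(url).maxBodySize(maxBytes).get().
     */
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the remaining budget.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the cap
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real caller would pass the
        // stream of an HTTP connection and close it after reading.
        byte[] fakePage = new byte[100_000];
        byte[] head = readAtMost(new ByteArrayInputStream(fakePage), 16 * 1024);
        System.out.println(head.length + " of " + fakePage.length + " bytes read");
    }
}
```

A truncated document can still be handed to the parser, but any element that falls after the cut will simply be missing, so the limit has to be tuned per page.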
1 Answer
One way out would be to create a web service that does the scraping and parsing and returns the results to you in a condensed form. Maybe host something on OpenShift?
18 MB is actually not much over 2 days. At roughly three pages every 10 minutes, that is about 860 requests, or around 20 KB each. Are you sure you can't afford this?

luksch