
I'm using Jsoup to parse the HTML of 3 different web pages in the background every 10 minutes or so. However, I found that in 2 days I consumed 18 MB of network data. Is there some way to reduce this data consumption? I don't need the whole HTML page; is there a way to download only part of a page's HTML?

Stack Diego
  • It depends on the site's structure. Which sites are you parsing? What are the parts that you need? – TDG Jul 13 '15 at 16:32
  • But from what I understand, there is no way to download just a part of the HTML: you have to download the whole document and then select the part that you need. Anyway, I found that there is a maxBodySize attribute that I can set; I think it may help. – Stack Diego Jul 14 '15 at 07:19
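
The idea behind a body-size cap like the maxBodySize setting mentioned in the comment can be sketched with plain `java.io`: stop reading the response stream after a fixed number of bytes and parse only that prefix. This is a minimal sketch, not Jsoup's implementation; the class name `CappedRead` and the method `readAtMost` are hypothetical names chosen for illustration. It only helps if the part you need appears within the first `maxBytes` of the page, and the actual traffic saved depends on buffering and compression along the way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: reads at most maxBytes from a stream, then stops.
final class CappedRead {
    static byte[] readAtMost(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int remaining = maxBytes;
        try {
            while (remaining > 0) {
                // Never ask for more than the remaining budget.
                int n = in.read(buf, 0, Math.min(buf.length, remaining));
                if (n < 0) {
                    break; // end of stream reached before the cap
                }
                out.write(buf, 0, n);
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With a real connection, the stream would come from something like `HttpURLConnection.getInputStream()`, and the connection should be closed after the capped read so the rest of the body is not fetched. With Jsoup itself, the cap from the comment would be set through `Jsoup.connect(url).maxBodySize(bytes)`.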

1 Answer


One way out would be to create a web service that does the scraping and parsing and returns the results to you in a condensed form. Maybe host something on OpenShift?

18 MB over 2 days is actually not much. Are you sure you can't afford this?
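
To put the 18 MB figure in perspective, here is a rough back-of-the-envelope calculation using the numbers from the question. It assumes a fetch exactly every 10 minutes for all 3 pages and 1 MB = 1024 × 1024 bytes; the class and method names are made up for this sketch.

```java
// Hypothetical helper for estimating average traffic per page fetch.
final class TrafficEstimate {
    // Number of page fetches over the given period.
    static long fetchCount(int days, int intervalMinutes, int pages) {
        long fetchesPerDay = (24L * 60) / intervalMinutes;
        return days * fetchesPerDay * pages;
    }

    // Average bytes transferred per fetch.
    static double bytesPerFetch(long totalBytes, long fetches) {
        return (double) totalBytes / fetches;
    }

    public static void main(String[] args) {
        long fetches = fetchCount(2, 10, 3);      // 2 days, every 10 min, 3 pages
        long totalBytes = 18L * 1024 * 1024;      // 18 MB, assuming binary megabytes
        System.out.println(fetches + " fetches, "
                + bytesPerFetch(totalBytes, fetches) + " bytes each on average");
    }
}
```

Under those assumptions this works out to 864 fetches at roughly 21 KB each, which is about the size of a small HTML page.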

luksch