I am using Goutte v2.0.4, which is a wrapper around the Symfony2 DomCrawler. The HTML files are stored locally. Some of them are under 10 MB, and I have crawled those successfully.

Other files are over 30 MB, and these are not getting crawled. This may be a file size issue, since all the files have similar formatting. What is going wrong, and how can I crawl large files?

Tejas
  • Do you mean your link discoverer is not finding the large files at all, or that Goutte attempts them and fails with an error? You can add an HTTP subscriber to Goutte to log what it tries and what fails, so if you don't already have that, it would be a good addition. – halfer May 09 '15 at 12:36
  • It tries and stops without showing any error. I didn't know about the HTTP subscriber; I'll look into that. v2.0.4 – Tejas May 09 '15 at 12:41
  • Do you receive any errors when you attempt to parse files >10 MB? If so, what is the error message that you receive? – Shaun Bramley May 18 '15 at 01:08
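
To check whether any error is being swallowed, one option is to load a file with PHP's own `DOMDocument` outside Goutte and print whatever libxml reports. The following is only a minimal sketch: `diagnoseHtmlFile` and the keys it returns are hypothetical names made up for illustration, not part of Goutte or DomCrawler. It assumes, without confirmation, that either PHP's `memory_limit` or a libxml parser limit might be involved, and `LIBXML_PARSEHUGE` is passed only to test the second possibility.

```php
<?php
// Hypothetical helper (not part of Goutte or DomCrawler): loads one HTML file
// with DOMDocument and reports what libxml and PHP say about it.
function diagnoseHtmlFile($path)
{
    // Collect libxml errors internally so they can be inspected afterwards.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    // Assumption: a libxml size limit may matter for very large inputs;
    // LIBXML_PARSEHUGE relaxes the parser's hardcoded limits.
    $loaded = $dom->loadHTMLFile($path, LIBXML_PARSEHUGE);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array(
        'size_bytes'    => filesize($path),
        'loaded'        => $loaded,
        'element_count' => $loaded ? $dom->getElementsByTagName('*')->length : 0,
        'errors'        => $messages,
        'memory_limit'  => ini_get('memory_limit'),
        'peak_bytes'    => memory_get_peak_usage(true),
    );
}

// Demo on a tiny temporary file; replace $path with one of the 30 MB files.
$path = tempnam(sys_get_temp_dir(), 'html');
file_put_contents($path, '<html><body><p>a</p><p>b</p></body></html>');
$report = diagnoseHtmlFile($path);
unlink($path);
print_r($report);
```

If the script dies on a large file before printing anything, comparing `memory_limit` with the file size and running it with `display_errors` enabled may show whether PHP itself is aborting. If it completes, the `errors` list shows what libxml reported for that file.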

0 Answers