I've written an application that parses large XML files in PHP using XMLReader.
Generally, the application works great, EXCEPT when I attempt to read a file larger than 2 GB.
(I haven't pinned down exactly where the cut-off is; it works flawlessly on a 500 MB file but fails on the next-largest file I have, which is 2.5 GB.)
Specifically, if my code looks like this:
$reader = new XMLReader();
if ($reader->open("big.xml")) {
    echo "Success!";
    $reader->close();
} else {
    echo "Failed!";
}
If I test the large (>2 GB) file, I get this:
Warning: XMLReader::open() [xmlreader.open]: Unable to open source data in [php script]
And of course, Failed! is output.
If I try with a smaller (500 MB) file, I get only the Success! output.
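In case it's useful, I could presumably get more detail than the generic warning by capturing libxml's internal errors. A minimal sketch (assuming the same big.xml file as above):

```php
<?php
// Sketch: collect libxml's internal errors instead of letting
// XMLReader::open() emit a PHP warning.
libxml_use_internal_errors(true);

$reader = new XMLReader();
if (@$reader->open("big.xml")) {
    echo "Success!";
    $reader->close();
} else {
    // Print whatever libxml recorded about the failed open.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    libxml_clear_errors();
}
```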
As far as I can tell, there's no difference between the large files that can't be opened and the medium-size files that can (permissions, valid XML, encoding are all the same) EXCEPT the size of the file.
And while the files themselves are large, the individual nodes are all tiny, so I don't think any single node would cause a memory issue.
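Since the failure starts somewhere between 500 MB and 2.5 GB, I wonder whether the 2 GB boundary (2^31 bytes) points to a 32-bit limit in my PHP build rather than anything in the XML itself. A quick check of the build's integer width (just a diagnostic sketch):

```php
<?php
// On a 32-bit PHP build, PHP_INT_SIZE is 4, and file offsets can be
// limited to 2^31 - 1 bytes (~2 GB); on a 64-bit build it is 8.
echo PHP_INT_SIZE, "\n";
echo PHP_INT_MAX, "\n";
```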