I read that SaxParser is a good fit for big files. But according to the documentation, the methods of the ContentParser class use 32-bit data types to store the position in the file. A Wikipedia dump can be a lot bigger than 2 GB. Is it therefore safe to use SaxParser with Wikipedia dumps? If not, what should I use instead?

Krzysztof Stanisławek
Is the Wikipedia dump actually one big XML file? If not, then SAX will be fine. – beerbajay Jun 06 '14 at 16:41