
When running Nutch jobs, I see the following warning:

Oct 13, 2020 8:46:18 AM org.apache.tika.utils.XMLReaderUtils acquireSAXParser
WARNING: Contention waiting for a SAXParser. Consider increasing the XMLReaderUtils.POOL_SIZE

What does this mean? I am using 150 threads and 3 fetchers. Do I need to change these parameters?

Ravi Kiran
  • What code are you using to call Apache Tika? Without that code, it's hard to tell you how to increase the pool size... – Gagravarr Oct 13 '20 at 14:12
  • See https://issues.apache.org/jira/browse/NUTCH-2582, an open issue to make Nutch adjust Tika's XMLReaderUtils.POOL_SIZE. Given a default pool size of 10, this warning can be ignored unless significantly more than 10 CPU cores are available, even if the number of parsing fetcher threads is larger. It doesn't matter whether threads wait for a SAXParser or, later, for CPU resources when parsing. – Sebastian Nagel Oct 14 '20 at 11:51
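
To make the reasoning in the comment above concrete, here is a minimal sketch in plain Java (no Tika dependency). The class and method names are hypothetical and exist only for illustration. It treats parsing as limited by whichever is smallest: the number of parsing threads, the number of CPU cores, or the SAXParser pool size. The strict `cores > poolSize` check is a simplification of "significantly more than 10 cores".

```java
// Hypothetical helper, not part of Nutch or Tika.
public final class SaxPoolSizing {
    // Default SAXParser pool size, as stated in the comment above.
    public static final int DEFAULT_POOL_SIZE = 10;

    private SaxPoolSizing() {}

    // True when more parsers could run in parallel on the CPU than the pool holds.
    // Only then does the pool, rather than the CPU, limit parsing throughput.
    public static boolean poolIsBottleneck(int cpuCores, int poolSize) {
        return cpuCores > poolSize;
    }

    // Upper bound on concurrently running parsers under this simplified model.
    public static int effectiveParallelism(int parsingThreads, int cpuCores, int poolSize) {
        return Math.min(parsingThreads, Math.min(cpuCores, poolSize));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = 150; // thread count from the question

        System.out.println("CPU cores: " + cores);
        System.out.println("Effective parallelism: "
                + effectiveParallelism(threads, cores, DEFAULT_POOL_SIZE));
        System.out.println("Pool is the bottleneck: "
                + poolIsBottleneck(cores, DEFAULT_POOL_SIZE));
    }
}
```

With 150 threads on an 8-core machine the model gives a parallelism of 8, so the pool of 10 is not the limit and the warning is harmless. On a 32-core machine the pool caps parsing at 10, and raising the pool size (for example via Tika's `XMLReaderUtils.setPoolSize`, assuming your Tika version provides it and you control the code that calls Tika) would be worth considering.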

0 Answers