
We're using Solr Cloud as a component of Cloudera's CDH5.5. A colleague built an application that POSTs long queries that exceed the default maxFormContentSize (200,000 bytes). I ran ps -ef on one of the cluster nodes and saw that Solr is started with a lot of options:

usr/java/jdk1.7.0_67-cloudera/bin/java
[...etc...]
-Djetty.port=8983
[...etc...]
org.apache.catalina.startup.Bootstrap start

To increase the size of the queries that Solr can handle, I'd like to increase maxFormContentSize. If we managed Solr from the command line, I imagine we'd be able to pass an argument like this:

-Dorg.eclipse.jetty.server.Request.maxFormContentSize=500000
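To judge whether 500000 would actually be enough, it helps to measure how large the form-encoded POST body is for a given query. The following is only a rough sketch: it assumes the limit applies to the URL-encoded body in bytes, and the helper names (`form_body_size`, `fits`) and the sample parameters are made up for illustration, not taken from our application.

```python
from urllib.parse import urlencode

# Default limit quoted in the question, and the value I'm proposing to set.
DEFAULT_MAX_FORM_CONTENT_SIZE = 200_000   # bytes
PROPOSED_MAX_FORM_CONTENT_SIZE = 500_000  # bytes


def form_body_size(params):
    """Return the size in bytes of the URL-encoded form body for params."""
    return len(urlencode(params).encode("utf-8"))


def fits(params, limit):
    """Return True if the encoded form body is no larger than limit bytes."""
    return form_body_size(params) <= limit


if __name__ == "__main__":
    # Hypothetical long query: a big OR of document ids.
    long_query = " OR ".join("id:{}".format(i) for i in range(30_000))
    params = {"q": long_query, "wt": "json"}

    print("encoded body size:", form_body_size(params), "bytes")
    print("fits default limit: ", fits(params, DEFAULT_MAX_FORM_CONTENT_SIZE))
    print("fits proposed limit:", fits(params, PROPOSED_MAX_FORM_CONTENT_SIZE))
```

Note that URL encoding inflates the body (for example, `:` becomes `%3A`), so the encoded size can be noticeably larger than the raw query string.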

But since we're using Cloudera Manager to control and monitor our services, it seems that the configuration change should be made there. I notice that Cloudera Manager has a "Java Configuration Options for Solr Server" setting, whose description says:

These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here.

I'd like to know:

  1. Is org.eclipse.jetty.server.Request.maxFormContentSize the right parameter to change to increase the POST size for Solr?
  2. Is "Java Configuration Options for Solr Server" in Cloudera Manager the right place to set this?
  3. If so, would I simply add -Dorg.eclipse.jetty.server.Request.maxFormContentSize=500000 to the "Java Configuration Options for Solr Server"?

And if I'm on the wrong track, how can we configure a CDH5.5-managed Solr to accept larger-than-default queries?

Alex Woolford
  • Solr Cloud is using Jetty. Do you have direct access to configure Jetty? – s.xie Mar 23 '16 at 02:39
  • If the query is so large, perhaps the problem is the schema? In the time it takes to type 200K of query you could pop down to the library and look it up there. – Gerry King Mar 23 '16 at 14:33
  • @s.xie: I'm able to edit files on the cluster nodes and see that there's a jetty.xml file. However, it's in a folder that contains 'example', and so I'm doubtful that it'll be read when I start the service: `$ locate jetty.xml /opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/share/doc/solr-doc-4.10.3+cdh5.5.1+325/example/etc/jetty.xml`. – Alex Woolford Mar 23 '16 at 14:39
  • What I meant to say was: if you are submitting the text of a Stephen King novel and asking for 'more like this', searching by author name, genre and other metadata makes for a less verbose query. Look at the queries and work out what you can do to help the indexer and query parser work together. – Gerry King Mar 23 '16 at 18:54
  • The large query submission is a corner case for an application that works well 99% of the time. I agree that it may be possible to restructure the data and/or the application for more efficient querying, but it would save a lot of work if we could just tweak a parameter instead. – Alex Woolford Mar 23 '16 at 19:24
