2

Is there a maximum size for a query collection parameter in JPA 2.1 / EclipseLink 2.5.2 (connecting to mysql 5.6, if that matters)?

e.g., what is the maximum allowable and/or recommend size for ids in the following code, to ensure performance and to avoid any possible bugs or issues in JPA / EclipseLink / mysql?:

@PersistenceContext
private EntityManager em;

{
    final Collection<Long> ids = /*obtain a Collection of IDs*/;

    em.createQuery("select e from Entity1 e where e.id in :ids")
        .setParameter("ids", ids);
}
XDR
  • 4,070
  • 3
  • 30
  • 54
  • I've read that the limit in Oracle is 1000 parameters, but I have not seen a limit for Mysql. You will have to test with your database to see how higher values affect performance. – Chris Oct 09 '14 at 12:48

4 Answers4

5

There is no simple and clear answer to your question as it depends on multiple factors like JVM's process heap size, persistence provider proprietary features, the underlying database tuning options, etc. In a real-life all these factors should be tuned up rather individually.


MySQL 5.6 Reference Manual, chapter 12.3.2 Comparison Functions and Operators defines that:

The number of values in the IN list is only limited by the max_allowed_packet value.

As max_allowed_packet is equal to the maximum size of one packet or any generated/intermediate string (here: 1GB) you should be theoretically allowed to send a query string of maximum size of 1GB.


JPA 2.0 specification (JSR-317), chapter 4.6.9 In Experssions does not mention anything about the limitations, so at least we can assume that it's a matter of the "weakest" component in the tool-chain (application container, persistence provider, underlying database, JVM heap size).


As noticed by @Chris it is limited in Oracle to 1000 parameters per statement. The similar issue with EclipseLink has been described by James (a former architect of TopLink/EclipseLink) on his blog:

The first thing that I noticed in this run was that Oracle has a limit of 1,000 parameters per statement. Since the IN batch fetching binds a large array, and I'm reading 5,000 objects, this limit was exceeded and the run blew up with a database error. The BatchFetchPolicy in EclipseLink accounts for this and defines a size for the max number of ids to include the an IN. The default size limit in EclipseLink was suppose to be 500 (...)

and

EclipseLink defines a JPA Query hint "eclipselink.batch.size" that allows the size to be set. So I will set this to 500 for the test. This means that to read in all of the 5,000 objects, the IN batch fetch will need to execute 10 queries per batch fetched relationship.

I encourage to read the whole post anyway.

Community
  • 1
  • 1
wypieprz
  • 7,981
  • 4
  • 43
  • 46
  • Thanks for the detailed info. It seems that `BatchFetchPolicy#size` is only used for batch fetching of entity relationships using `BatchFetchType.IN`. Can EclipseLink 2.5.2 be configured to partition a collection used as a query value parameter? e.g., if I configure EclipseLink to partition into 500 element chunks, and if I pass in a 10,000 item `Collection` as a query parameter value, when I call `getResultList()` on the query, I want EclipseLink to run the query 20 times with 500 elements for each run, then concatenate the results into one returned `List`. – XDR Oct 12 '14 at 10:15
  • The scenario you are describing reminds a typical 'pagination'. I am afraid JPA/EclipseLink don't internally support such functionality, but rather delegate it outside by providing `Query` interface with `setMaxResults` and `setFirstResult` methods. Take a look at it from more generic perspective - the problem is specific to a given environment therefore it wouldn't be possible by EclipseLink or MySQL to figure out the perfect boundaries in which all queries will work in the most performant way. Usually default settings satisfy majority of environments, which eventually may be tuned up. – wypieprz Oct 12 '14 at 18:46
0

The "limit" is whatever the limit in SQL is in your database. Look at the SQL being created by your JPA provider, and then look at the limit for that syntax in the RDBMS you use

0

Use setMaxResults() method.

For Example Result is as below

1, 2, 3, 4, 5, 6, 7, 8 , 9, 10

Only Max Result

em.createQuery(...).setParameter(...).setMaxResults(5).getResultList();

Output

1, 2, 3, 4, 5

Usage of Range

em.createQuery(...).setParameter(...).setFirstResult(3).setMaxResults(5).getResultList();

Output

4, 5, 6, 7, 8

Zaw Than oo
  • 9,651
  • 13
  • 83
  • 131
  • This is not what I'm looking for. I don't need to limit the output of a JPA query. I need to know how many items I can safely pass in as members of a single input collection parameter to the query. – XDR Oct 09 '14 at 17:14
0

Other frameworks such as openJPA are able to split the SQL transparently for the user, hence if a list of more than 1000 items is supported in a IN clause with placeholder, its splitting automatically. Would be nice if EclipseLink would support such a feature as well, as it could be automated by the implementation. Shifting the workload to the user, dependent on the underlaying database, is not developer-friendly.