0

Related to: How to countDistinct on multiple columns

I have an entity class that contains many fields, three of which are longitude, latitude and update_time. I am trying to add a @Formula field that concatenates the three:

@Formula("concat(longitude, latitude, update_time)")
public String fix; 

I would then like to use that field as part of a countDistinct query:

@SuppressWarnings( {"unchecked", "rawtypes"} )
public long getCountDistinctPositions() {
    Session session = sessionFactory.openSession();

    CriteriaBuilder criteriaBuilder = session.getCriteriaBuilder();

    CriteriaQuery criteriaQuery = criteriaBuilder.createQuery();
    Root<Position> position = criteriaQuery.from(Position.class);
    Expression fix = position.get("fix");
    Expression countDistinct = criteriaBuilder.countDistinct(fix);
    criteriaQuery.select(countDistinct);
    Query query = session.createQuery(criteriaQuery);
    Long result = (Long)query.getSingleResult();

    session.close();

    return result;
}

But I keep getting an exception:

java.lang.IllegalArgumentException: Unable to locate Attribute  with the the given name [fix] on this ManagedType [aaa.Position]
spierepf
  • 2,774
  • 2
  • 30
  • 52

1 Answers1

0

A @Formula field cannot be queried, its basically a synthetic field that Hibernate will populate when a select query is executed only.

For your use case, you'd need to execute:

SELECT DISTINCT longitude, latitude, update_time FROM Entity

From a JPA query perspective, you'd need to execute #countDistinct on all 3 individual columns, not on the formula-column.

The best way to think of a formula column is that you've effectively translated the following method on your entity

@Transient
public String getFix() {
  return this.longitude + this.latitude + this.update_time;
}

to a property that the database concatenates for you at query-time.

UPDATE

CriteriaQuery<Long> countQuery = cb.createQuery( Long.class );
Root<TheEntity> root = countQuery.from( TheEntity.class );

countQuery.select( cb.count( root.get( "id" ) ) );

Subquery<Integer> subQuery = countQuery.subquery( Integer.class );
Root<TheEntity> subRoot = subQuery.from( TheEntity.class );
subQuery.select( cb.min( subRoot.get( "id" ) ) );
subQuery.groupBy( subRoot.get( "longitude" ), 
   subRoot.get( "latitude" ), 
   subRoot.get( "updateTime" ) );

countQuery.where( root.get( "id" ).in( subQuery ) );

Long count = entityManager.createQuery( countQuery ).getSingleResult();
Naros
  • 19,928
  • 3
  • 41
  • 71
  • Yes, I am trying to convince Hibernate to execute something like "SELECT COUNT DISTINCT longitude, latitude, update_time FROM Position" but I cannot figure out how. Surely I don't need to open my own JDBC connection to the database and run my query on the sly? – spierepf May 18 '18 at 13:26
  • Please review my update to the answer with an example. – Naros May 18 '18 at 13:32
  • Thanks for the clarification. Concat only accepts two parameters, so I nested several of them but now I get this error: `Caused by: org.hibernate.hql.internal.ast.QuerySyntaxException: unexpected token: ( near line 1, column 23 [select count(distinct ((((generatedAlias0.longitude || :param0) || generatedAlias0.latitude) || :param1) || generatedAlias0.updateTime)) from aaa.Position as generatedAlias0 where generatedAlias0.onGround=:param2]` – spierepf May 18 '18 at 14:01
  • ultimately, I am trying to convince Hibernate/JPA to evaluate this query `SELECT COUNT(*) FROM (SELECT DISTINCT LONGITUDE, LATITUDE, UPDATE_TIME FROM POSITION) AS TEMP;` – spierepf May 18 '18 at 14:02
  • I added an example of how to do that via JPA. The most important thing is to be sure that you use the appropriate `root` or `subRoot` at the right spots to get the reference to the right entity type alias. For a situation where i have 3 entities and 2 share the same longitude, latitude, and updateTime fields, I get a result of 2. – Naros May 18 '18 at 15:47
  • thanks so much for the update. It works on test cases, but I'm concerned that when I start processing real data performance is going to suffer because we're computing min(id). Is computing min(id) necessary? – spierepf May 18 '18 at 16:26
  • Unless you can find a way to replace the predicate condition of `WHERE root.id IN ( SELECT min( subquery_root.id ) ... )` with something else, yep it is. – Naros May 18 '18 at 16:41
  • Also be aware that if performance begins to suffer, you could also look into a database trigger which handles concat into a single field for you and then you should be able to rewrite the query as `SELECT COUNT (DISTINCT triggerMaintainedField ) FROM Position` – Naros May 18 '18 at 16:48