How to build query with CriteriaBuilder for a query with a subquery that returns dynamic number of columns?

Question

I have the following query (it finds the row with the most similar name and selects it plus the value and rating):

SELECT
  DISTINCT FIRST_VALUE(value) over (partition by name order by distance asc) value,
  rating,
  name
FROM (
  SELECT
    value,
    rating,
    LEAST(
      UTL_MATCH.EDIT_DISTANCE('pineapple',   upper(name)),
      UTL_MATCH.EDIT_DISTANCE('yellowapple', upper(name)),
      UTL_MATCH.EDIT_DISTANCE('greenapple',  upper(name)),
    ) as apple,
    LEAST(
      UTL_MATCH.EDIT_DISTANCE('carbage',  upper(name)),
      UTL_MATCH.EDIT_DISTANCE('cabbagge', upper(name)),
      UTL_MATCH.EDIT_DISTANCE('bagge',    upper(name)),
    ) as cabbage,
    LEAST(
      UTL_MATCH.EDIT_DISTANCE('chakn',  upper(name)),
      UTL_MATCH.EDIT_DISTANCE('chkn',   upper(name)),
      UTL_MATCH.EDIT_DISTANCE('chikee', upper(name)),
    ) as chicken
    -- more of the above could follow...
  FROM food_table
) UNPIVOT (distance FOR name in (apple as 'apple', cabbage as 'cabbage', chicken as 'chicken'));

It turns my table from this:

category	subcategory	name	value	rating
fruit	red	apple	4g	4/10
vegetable	green	cabbage	4g	3/10
meat	white	chicken	49g	8/10
chemical	black	cocacola	12g	10/10
chemical	blue	pepsi	12g	10/10

to this:

name	value	rating
apple	4g	4/10
cabbage	4g	3/10
chicken	49g	8/10

The subquery basically returns something like this (before the unpivoting):

value	rating	apple	cabbage	chicken
4g	4/10	3*	8	7
4g	3/10	8	4*	6
49g	8/10	5	9	4*
12g	10/10	7*	12	10
12g	10/10	11	7	6*

I match the most similar strings to what I have in the table.
If there is more than one least per variable given, pick the one with the least distance (for example chicken over pepsi and apple over cocacola).
I unpivot and partition by order so I can find the most similar and eliminate based on that (remove pepsi and cocacola).

I managed to do the following to only get the subquery, but I need to add the outer query now:

public Stream<Projection> stream(
    Stream<Pair<String, Stream<String>>> similarStringGroups,
    Class<Projection> projectionClass,
    Class<T> entityClass,
    String termField
) {
    CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
    CriteriaQuery<Projection> query = criteriaBuilder.createQuery(projectionClass);
    Root<T> root                    = query.from(entityClass);
    RecordComponent[] recordComponents = projectionClass.getRecordComponents();

    Stream<Expression<Integer>> leastExpressions = similarStringGroups.map(
        similarStringGroup ->
            criteriaBuilder.function(
                "LEAST",
                Integer.class,
                similarStringGroup.getSecond().map(
                    similarString -> criteriaBuilder.function(
                        "UTL_MATCH.EDIT_DISTANCE",
                        Integer.class,
                        criteriaBuilder.literal(similarString.toUpperCase()),
                        criteriaBuilder.upper(root.get(termField))
                    )
                ).toArray(Expression[]::new)
            )
    );

    return entityManager.createQuery(criteriaQuery.multiselect(Stream.concat(leastExpressions, Stream.of(root.get(recordComponents[4].getName()), root.get(recordComponents[5].getName()))).toArray(Selection[]::new)))
}

Just to add to the above method that I wrote:

entityClass is just the Food.class.
termField is name in the SQL query.
projectionClass is temporarily set to public record TemroraryRecord(Integer apple, Integer cabbage, Integer chicken, String value, String rating) but it will eventually be public record NameValueRating(String name, String value, String rating) for the outer query. I can't have a class for the subquery because the number of result columns is variable.
similarStringGroups is basically the following:

Stream.of(
    Pair.of("apple",   Stream.of("pineapple", "yellowapple", "greenapple")),
    Pair.of("cabbage", Stream.of("carbage",   "cabbagge",    "bagge")),
    Pair.of("chicken", Stream.of("chakn",     "chkn",        "chikee"))
)

How to build query with CriteriaBuilder for a query with a subquery that returns dynamic number of columns?

0 Answers0