How to modify Cube.js schema to fix "column must appear in the GROUP BY clause or be used in an aggregate function" error?

Question

I encounter an issue using Cube.js with avg measures. This is an example of two cubes, "Posts" and "Comments", with the following schema:

const schemaName = 'my_schema'

cube('Posts', {
    sql: `SELECT * FROM ${schemaName}.posts`,
    joins: {
        Comments: {
            relationship: 'hasMany',
            sql: `${CUBE}.id = ${Comments}.post_id`,
        },
    },
    measures: {
        count: {
            sql: 'id',
            type: 'count',
        },
        avgCommentsCount: {
            sql: `${Comments.count} / ${CUBE.count}`,
            type: 'avg',
        },
    },
    dimensions: {
        id: {
            sql: 'id',
            type: 'string',
            shown: true,
            primaryKey: true,
        },
    },
})

cube('Comments', {
    sql: `SELECT * FROM ${schemaName}.comments`,
    joins: {
        Posts: {
            relationship: 'belongsTo',
            sql: `${CUBE}.post_id = ${Posts}.id`,
        },
    },
    measures: {
        count: {
            sql: 'id',
            type: 'count',
        },
    },
    dimensions: {
        id: {
            sql: 'id',
            type: 'string',
            shown: true,
            primaryKey: true,
        },
        type: {
            sql: 'type',
            type: 'string',
        },
    },
})

I'm trying to run a query with the following JSON:

{
  "dimensions": [
    "Comments.type"
  ],
  "order": {
    "Posts.avgCommentsCount": "desc"
  },
  "measures": [
    "Posts.avgCommentsCount"
  ]
}

However, when Cube.js generates the SQL for this query, it throws the following error:

column "q_0.comments__type" must appear in the GROUP BY clause or be used in an aggregate function

I understand that this error occurs because the top-level SELECT statement is missing a GROUP BY clause for the "Comments.type" dimension. I'm not sure how to modify my Cube.js schema to fix this issue.

Can anyone suggest how to modify the schema to fix this error?

Edit: In addition, here is the generated SQL with the missing GROUP BY in comment:

SELECT
  q_0."comments__type",
  avg(
    "comments__count" / "posts__count"
  ) "posts__avg_powers_count"
FROM
  (
    SELECT
      "main__comments".type "comments__type",
      count("main__comments".id) "comments__count"
    FROM
      my_schema.users AS "main__posts"
      LEFT JOIN my_schema.powers AS "main__comments" ON "main__posts".id = "main__comments".userid
    GROUP BY
      1
  ) as q_0
  INNER JOIN (
    SELECT
      "keys"."comments__type",
      count(
        "posts_key__posts".id
      ) "posts__count"
    FROM
      (
        SELECT
          DISTINCT "posts_key__comments".type "comments__type",
          "posts_key__posts".id "posts__id"
        FROM
          my_schema.users AS "posts_key__posts"
          LEFT JOIN my_schema.powers AS "posts_key__comments" ON "posts_key__posts".id = "posts_key__comments".userid
      ) AS "keys"
      LEFT JOIN my_schema.users AS "posts_key__posts" ON "keys"."posts__id" = "posts_key__posts".id
    GROUP BY
      1
  ) as q_1 ON (
    q_0."comments__type" = q_1."comments__type"
    OR (
      q_0."comments__type" IS NULL
      AND q_1."comments__type" IS NULL
    )
  )
-- The missing GROUP BY:
-- GROUP BY
--  1
ORDER BY
  2 DESC
LIMIT
  10000

I have resolved my issue by adding a dimension of the `Comments.count` measure in the `Posts` cube with a `subQuery` set to "true", but I'm still interested in an other way to do that. Because If I have to avg measures of the `Posts` cube it will not be possible to use subQueried dimension. — ecavard, Mar 09 '23 at 16:17

How to modify Cube.js schema to fix "column must appear in the GROUP BY clause or be used in an aggregate function" error?

0 Answers0