1

Consider a database schema of the following:

  • An Entity table.
  • A SpecializedEntity table, which has exactly 0 or 1 record for each record in Entity, and a nullable LookupValueId column.
  • A LookupValue table, which has a record for each possible LookupValueId.

Example database schema

CubeJS relationships are defined as follows:

  • Entity defines a hasOne relationship to SpecializedEntity, and SpecializedEntity defines a belongsTo relationship to Entity
  • SpecializedEntity defines a belongsTo relationship to LookupValue, and LookValue defines a hasMany relationship to SpecializedEntity.

Finally, a count measure is defined on Entity.

When attempting to Count the number of entities by LookupValueDescription, an inflated/incorrect count (>100x higher) of NULL values is returned (counts for all other LookupValueDescriptions is correct), due to the LEFT JOIN generated by CUBEJS for SpecializedEntity.

SELECT
  TOP 10000 "lookup_value".LookupValueDescription "lookup_value__lookupValueDescription",
  count("entity".EntityId) "entity__count"
FROM
  Entity AS "entity"
  -- THIS NEXT LEFT JOIN BELOW RESULTS IN INCORRECT NULL COUNT
  LEFT JOIN SpecializedEntity AS "specialized_entity" ON "speciailized_entity".EntityId = "entity".EntityId
  LEFT JOIN LookupValue AS "lookup_value" ON "lookup_value".LookupValueId = "specialized_entity".LookupValueId
GROUP BY
  "lookup_value".LookupValueDescription

If the LEFT JOIN on SpecializedEntity is replaced with an INNER JOIN, the correct count for NULL values is returned. Of course, cubejs does not support INNER JOINs.

Why this NULL count is invalid when counts for all other values are correct?

Can the relationships between the three tables/cubes be adjusted to return a correct NULL count?

Andrew Keller
  • 3,198
  • 5
  • 36
  • 51

0 Answers0