We are having trouble with pre-aggregation build performance. We currently have client-specific data filters, and we generate a different cube for each client by extending a base cube (called `Metrics`) and defining a segment that represents those filters.
To summarize: we have a `Metrics` base cube, and we generate dynamic cubes `MetricsA`, `MetricsB`, `MetricsC` for clients A, B, C. Each of these cubes has a segment we call `z`, which contains a client-specific SQL filter. The data needed to build the segment is retrieved from our API using `asyncModule`, and we then extend the `Metrics` cube to generate all the client-specific cubes, overriding the `z` segment with each client's filter.
This way, when a client queries the cube service, the results come from their specific cube, with the data already filtered by the enforced `z` segment.
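For context, the setup described above looks roughly like this. This is a minimal sketch, not our actual schema: the table, measure, and dimension names are placeholders, and `fetchClientFilters` is a hypothetical helper standing in for our API call.

```javascript
// schema/Metrics.js — simplified base cube (placeholder schema)
cube(`Metrics`, {
  // Stands in for our large multi-table join
  sql: `SELECT * FROM fact_events`,

  measures: {
    count: { type: `count` },
  },

  dimensions: {
    createdAt: { sql: `created_at`, type: `time` },
  },

  segments: {
    // Placeholder; overridden per client in the generated cubes below
    z: { sql: `1 = 0` },
  },
});

// schema/ClientCubes.js — one cube per client, filters fetched from our API
asyncModule(async () => {
  // Hypothetical helper returning e.g.
  // [{ name: 'A', filterSql: "client_id = 'a'" }, ...]
  const clients = await fetchClientFilters();

  clients.forEach(({ name, filterSql }) => {
    cube(`Metrics${name}`, {
      extends: Metrics,
      segments: {
        // Enforce the client-specific filter
        z: { sql: `${filterSql}` },
      },
    });
  });
});
```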
The `Metrics` cube is built by joining large tables, so we also added a `partitionGranularity` (monthly) to reduce the size of the pre-aggregation tables, but they still take too long to build (over 10 minutes).
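The pre-aggregation is declared along these lines (again a sketch with placeholder member names, not our real rollup definition):

```javascript
cube(`Metrics`, {
  // ... measures and dimensions as above ...

  preAggregations: {
    main: {
      measures: [CUBE.count],
      timeDimension: CUBE.createdAt,
      granularity: `day`,
      // Split the rollup into monthly partitions so each
      // build job scans a smaller slice of the joined data
      partitionGranularity: `month`,
    },
  },
});
```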
We need to edit the query that the cube service submits to create the pre-aggregation tables, so that we only keep the rows matching the `z` segment (since that is the only relevant data), or at least be able to rearrange/modify the query to improve performance. What is the best place to make such changes, and what is the recommended practice for intervening in this process?