Look at my next Vertica query:
WITH groupA AS (
SELECT
userId,
color
FROM worldUsers
WHERE DayPartition = 409
AND Hour = 12
)
SELECT
color = 12 AS userColor,
REGEXP_LIKE(familyName, '[a-z]+ish', 'i') ishPeople,
REGEXP_LIKE(familyName, '[a-z]+lem', 'i') lemPeople,
count(*) AS peopleCount,
round(count(*) / sum(count(*))
OVER (), 5.0) * 100 || '%' AS peoplePercentage
FROM PlanetUsers
JOIN groupA USING (userId)
WHERE DayPartition = 409
GROUP BY 1, 2, 3
ORDER BY 4 DESC
Both the tables worldUsers and PlanetUsers got the field Hour
and they both partitioned by DayPartition
field. Will it improve the performance if we change the where statement of the below query
from
WHERE DayPartition = 409
To
WHERE DayPartition = 409 and AND Hour = 12
Or change the using statement
From
USING (userId)
To
USING (userId, Hour)
Assume that except the partition on DayPartition
there are no primary keys and no projections