SITUATION
I am working on a travel engine site and am writing a complex query to match visitors' search queries with their bookings based on IP Address, Destination and Date so I can work out the conversion ratio later.
PROBLEM
There needs to be multiple conversion ratios based on a parameter (In this case, the utm_source which I extract from a RequestUrl stored in the searches table). The problem is some users make multiple searches from different locations.. sometimes we get the utm_source in the request and sometimes not... and of course we need to match to only 1 booking. See screenshot of query result below to better understand:
See the 3rd and 4th rows have the same booking ID, etc.. but different values for the Value column. I need to select only 1 of these, but not both. Basically, if there is more than 1, I need to choose the 1 that is not "N/A".
MY QUERY:
SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value"
FROM
(
SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
FROM dbo."PackageQueries"
WHERE "SiteId" = '<The ID>'
AND "CreatedAt" >= '<Start Date>'
AND "CreatedAt" < '<End Date>'
) AS "PQ"
INNER JOIN dbo."Bookings" AS "B"
ON "PQ"."IPAddress" = "B"."IPAddress"
AND "B"."To" = "PQ"."To"
AND "B"."BookingDate"::date = "PQ"."SearchDate"
WHERE "B"."SiteId" = '<The ID>'
AND "B"."BookingStatus" = 2
AND "B"."BookingDate" >= '<Start Date>'
AND "B"."BookingDate" < '<End Date>'
ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To";