Using Oracle SQL, I’m trying to calculate total unique visits to a website. The table I’m using to write the query does not have a timestamp which includes minutes and seconds just DDMMYY and every row in the table represents a customer click on the page. The table designates a new “session” every hour, regardless of whether that actually reflects a new visit from the customer’s POV. What I must do is use non-consecutive sessions as a proxy for unique visits. So, if there is an hour break between visits the previous consecutive grouping is one visit. I define a visit as a unique combination of customer ID + session day + session hour. If there are consecutive session hours within a customer + day combination, I count that as a single session. The HOUR filed contains string values that concatenate date with hour. In order to do the appropriate visit count calculation, I will need to parse out the hour and subtract from the previous (lag) row in order to determine if there is greater than an hour “break”.
Example of Raw Data:
TRANS_TO_DATE CUSTOMER_ID HOUR
10/21/17 1007589445 October 21, 2017, Hour 1
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 3
10/21/17 1007589445 October 21, 2017, Hour 5
10/21/17 1007589445 October 21, 2017, Hour 6
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
11/1/17 1007589445 November 1, 2017, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 11
1/1/18 1007589445 January 1, 2018, Hour 14
1/1/18 1007589445 January 1, 2018, Hour 20
1/1/18 1007589445 January 1, 2018, Hour 22
The visit count is actually this:
Customer_id Day Hour Visit Grouping
1007589445 October 21, 2017 1 Visit 1
1007589445 October 21, 2017 2 Visit 1
1007589445 October 21, 2017 3 Visit 1
1007589445 October 21, 2017 5 Visit 2
1007589445 October 21, 2017 6 Visit 2
1007589445 October 21, 2017 23 Visit 3
1007589445 November 1, 2017 10 Visit 1
1007589445 January 1, 2018 10 Visit 1
1007589445 January 1, 2018 11 Visit 1
1007589445 January 1, 2018 14 Visit 2
1007589445 January 1, 2018 20 Visit 3
1007589445 January 1, 2018 21 Visit 4
Customer 1007589445 had
3 visits on October 21, 2017 - 1 visit on November 1, 2017 - 4 visits on January 1, 2018
Total visits: 8
Below is the sql code I have so far which needs to be modifide to satisfy the critera above.
select
CUSTOMER_ID,
TRANS_TO_DATE,
HOUR,
count (HOUR) as visits
from mstr_clickstream_vw
where trans_to_date between start_date and end_date
and web_store_ind='US'
group by CUSTOMER_ID, TRANS_TO_DATE,HOUR