1

I'm using BigQuery and DataStudio to show retention plotted throughout the weeks and i'm having trouble with that when a user is new and then uses the app again the same week he's both new and retained in my calculations and I would like it to be so that he's only new the first week of using the app and then if he uses the app again within 2 weeks, he's "Retained".

This is my query:

SELECT
UserID,
DATE,
DATE_DIFF(DATE,PreviousSessionDATE, DAY) as DaysBetweenSessions,
(SELECT
 CASE
WHEN DaysBetweenSessions <= 14 THEN 'Retained'
WHEN DaysBetweenSessions >14 THEN 'Returned'
WHEN DaysBetweenSessions IS NULL AND FirstSessionDATE = DATE THEN 'New'
WHEN DaysBetweenSessions IS NULL THEN 'User has an old version without Retention Parameters'
END) as User_Type
FROM
app_project.analytics_*********.events_*
GROUP BY
1,2,3,4
ORDER BY
DATE DESC,
DaysBetweenSessions DESC,
1,2,3,4

Then the result will generally be good except for when a user uses the app several times and get a between 1-14 as DaysBetweenSessions and then be counted as both new and retained within the same week.

Then in DataStudio I will do YEARWEEK(DATE) to vizualise per week and Count_Distinct(UserID) as my metrics.

Any ideas on how i can sort out so that the new user is only counted as new in the first week even if the user trains within that week?

Current output in BQ:
UserID     DATE        DaysBetweenSessions     User_Type
123        20180801    NULL                    "New"
123        20180801    0                       "Retained"

And desired output

UserID     DATE        DaysBetweenSessions     User_Type
123        20180801    NULL                    "New"
PhilipEldh
  • 25
  • 6
  • Do you still want separate lines with both being "New" instead of "New","Retained",... or do you want a single line with just "New", retaining that first session? An example of the current and the desired output would be useful. – Bobbylank Aug 02 '18 at 09:35
  • @Bobbylank So the desired output is to only count the new user as new during the 7 days duration, in the same column as New, Retained, Returning. example: DATE:20180101 type: "New" id: "123", and also then DATE:20180102 type:"Retained" id:"123" desired output is the first row only – PhilipEldh Aug 02 '18 at 10:00
  • I edited that for the 'first week' aspect but wasn't sure if you meant 7-days after new visit or a calendar week. If it's a calendar week then change that 'DaysBetween' part to the week's start date – Bobbylank Aug 02 '18 at 10:56
  • @Bobbylank This solved it! Thanks! – PhilipEldh Aug 03 '18 at 07:57

1 Answers1

2

There might be a neater way to do this but...

WITH CTE AS
(SELECT
UserID,
DATE,
DATE_DIFF(DATE,PreviousSessionDATE, DAY) as DaysBetweenSessions,
(SELECT
CASE
WHEN DaysBetweenSessions <= 14 THEN 'Retained'
WHEN DaysBetweenSessions >14 THEN 'Returned'
WHEN DaysBetweenSessions IS NULL AND FirstSessionDATE = DATE THEN 'New'
WHEN DaysBetweenSessions IS NULL THEN 'User has an old version without Retention 
Parameters'
END) as User_Type,
(SELECT
CASE
WHEN DaysBetweenSessions <= 7 THEN 0
WHEN DaysBetweenSessions >7 THEN 1
WHEN DaysBetweenSessions IS NULL AND FirstSessionDATE = DATE THEN 0
WHEN DaysBetweenSessions IS NULL THEN 2
END) as DaysBetween
FROM
app_project.analytics_*********.events_*
GROUP BY
1,2,3,4
ORDER BY
DATE DESC,
DaysBetweenSessions DESC,
1,2,3,4),

Result as 
(SELECT *, min(User_Type) OVER (PARTITION BY UserID, DaysBetween) minUser_Type
FROM CTE)

SELECT UserID,
DATE,
DaysBetweenSessions,
User_type 
FROM Result 
WHERE NOT (User_Type <> 'New' AND minUser_Type = 'New')

The second part should add a dimension which is the alphabetically lowest User_Type in that week (so if you renamed anything to alphabetically less than 'New' it wouldn't work, better using numbers ideally).

The last part should get rid of the ones where there was a 'New' in that week but the row User_Type isn't 'New'.

Bobbylank
  • 1,906
  • 7
  • 15