-1

Below I have the tables and query which output the below

Table1

EmployeeID | StartDateTimestamp | CohortID | CohortName
---------- | ------------------ | -------- | ----------
1          | 20080101 01:30:00  | 1        | Peanut
1          | 20090204 01:01:00  | 2        | Apple
2          | 20190107 05:52:14  | 1        | Peanut
3          | 20190311 02:35:26  | 2        | Apple

Employee

EmployeeID | HireStartName | StartDateTimestamp2
---------- | ------------- | -------------------
1          | HiredStart    | 20080501 01:30:00
1          | DeferredStart | 20090604 01:01:00
2          | HiredStart    | 20190115 05:52:14
3          | HiredStart    | 20190330 02:35:26

Query

select 
    t.cohortid, 
    min(e.startdatetimestamp2) first,  
    max(e.startdatetimestamp2) last

from table1 t
inner join employee e on e.employeeid = t.employeeid
group by t.cohort_id

Output

ID | FIRST            | LAST
1  |20190106 12:00:05 |20180214 03:45:12
2  |20180230 01:45:23 |20180315 01:45:23

My attempt:

SELECT DATE_DIFF(first, last, Day), ID, max(datecolumn1) first, min(datecolumn1) last

Error: Unrecognized name. How do I enter the reference alias first and last in a Date_Diff? Do I need to derive a table?

Clarity: Trying to avoid inputting in the dates, since I am looking to find the date diff of both first and last columns for as many rows as there is data.

This answer has been discussed here: Date Difference between consecutive rows

DateDiff has deprecated, and now it is Date_Diff (first, last, day)

Then I tried:

SELECT ID, DATE_DIFF(PARSE_DATE('%y%m%d',t.first), PARSE_DATE('%y%m%d',t.last), DAY) days
FROM table

Failed to parse input string "20180125 01:00:05"

Tried this

SELECT CohortID, date_diff(first,last,day) as days
FROM (select cohortid,min(startdatetimestamp2) first, 
max(startdatetimestamp2) last 
FROM employee 
JOIN table1 on table1.employeeid = employee.employeeid 
group by cohortid)

I get days not found on either side of join

1 Answers1

0

Regarding your first question about using aliases in a query, there are some restriction where to use them, specially in the FROM, GROUP BY and ORDER BY statements. I encourage you to have a look here to check these restrictions.

About your main issue, obtaining the date difference between two dates. I would like to point that your timestamp data, in both of your tables, are actually considered as DATETIME format in BigQuery. Therefore, you should use DATETIME builtin functions to get the desired results.

The below query uses the data you provided to obtain the aimed output.

WITH
  data AS 
  (
  SELECT
    t.cohortid AS ID,
    PARSE_DATETIME('%Y%m%d %H:%M:%S', MIN(e.startdatetimestamp2)) AS first,
    PARSE_DATETIME('%Y%m%d %H:%M:%S', MAX(e.startdatetimestamp2)) AS last
  FROM
    `test-proj-261014.sample.table1` t
  INNER JOIN
    `test-proj-261014.sample.employee` e
  ON
    e.employeeid = t.employeeid
  GROUP BY t.cohortid 
  )
SELECT
  ID,
  first,
  last,
  DATETIME_DIFF(last, first, DAY ) AS diff_days
FROM
  data

And the output:

enter image description here

Notice that I created a temp table to format the fields StartDateTimestamp and StartDateTimestamp2, using the PARSE_DATETIME(). Afterwards, I used the DATETIME_DIFF() method to obtain the difference in days between the two fields.

halfer
  • 19,824
  • 17
  • 99
  • 186
Alexandre Moraes
  • 3,892
  • 1
  • 6
  • 13