I currently have a table in BigQuery that contains some outliers
Example table:
port - qty - datetime
--------------------------------
TCP1 - 13 - 2018/06/11 11:20:23
UDP2 - 15 - 2018/06/11 11:24:24
TCP3 - 14 - 2018/06/11 11:24:27
TCP1 - 2 - 2018/06/11 11:24:26
UDP2 - 15 - 2018/06/11 11:35:32
TCP3 - 13 - 2018/06/11 11:45:23
TCP3 - 14 - 2018/06/11 11:54:22
TCP3 - 30 - 2018/06/11 11:55:33
I would like to be able to sift out the outliers on the various ports at 2018/06/11 using SQL and standard deviation
Result:
TCP1 - 2 - 2018/06/11 11:24:26
TCP3 - 30 - 2018/06/11 11:55:33
I have done some research and found that standard deviation is able to help sift out the outliers. However, i do not know how to write the SQL query to make this work. Any help will be greatly appreciated.
(this is the closest thread that i could find on this topic: Using BigQuery to find outliers with standard deviation results combined with WHERE clause)