Questions tagged [window-functions]

A window function is a type of SQL operation that aggregates over a partition of the result set.

As a basic example of a window function, consider a table that contains a list persons along with their age and gender.
If you wanted a query to return the complete list of people along with their age rank per gender, you could accomplish this most simply by using a ranking window function.
Such a function can be used to rank individual rows within a subset of the results.

Using a window function the above function would look like this:

SELECT name, 
       age, 
       gender, 
       RANK() OVER (PARTITION BY gender ORDER BY age DESC) AS age_rank
FROM persons

Window functions as defined by the SQL standard can also access rows before and after the current row using the lag() or lead() function. The following query calculates the delta to the previous row for the amount column for each customer:

SELECT customer_id,
       order_date,
       amount, 
       amount - lag(amount) over (partition by customer_id ORDER BY order_date) as delta
FROM order
ORDER BY customer_id, order_date;

Standard aggregates (e.g. count(), sum()) can also be used as a window function. If combined with an ORDER BY (inside the partition defintion) this will generate "running" totals based on the aggregate being used.

Window functions are available in a wide range of DBMS: Oracle (called "analytical functions" there), Microsoft SQL Server, DB2 (called "olap functions" there) PostgreSQL, Teradata, Sybase, Vertica, and partially in H2 Database.

4082 questions
1
vote
2 answers

Add value of type B to type A of same org_id

I have a table like this ORG_ID TYPE VALUE 1 A 20 1 B 30 1 C 10 2 B 60 Now I want to select values from this table for each org_id, type but for type A i want to add value of type B of same org_id to…
Sanguis
  • 113
  • 7
1
vote
1 answer

Select first in and last out time - different date and null condition - from data finger

Here is my data finger table [dbo].[tFPLog] CardID Date Time TransactionCode 100 2020-09-01 08:00 IN 100 2020-09-01 17:00 OUT 100 2020-09-01 17:10 OUT 200 2020-09-02 02:00 OUT 200 2020-09-02 02:15 OUT 100 …
Jml
  • 79
  • 1
  • 8
1
vote
2 answers

Assign a certain value to all other values if at least one criteria per value is met

DB-Fiddle CREATE TABLE operations ( id int auto_increment primary key, campaign VARCHAR(255), country VARCHAR(255), sales_status VARCHAR(255), quantity INT ); INSERT INTO operations (campaign, country, sales_status,…
Michi
  • 4,663
  • 6
  • 33
  • 83
1
vote
2 answers

Percentile in Case Statement In Athena SQL

Trying to execute this in Athena SQL. I have '75' as default tax_type which i need to replace with '76' for 20% of tax_amt Current output as is geocode tax_type tax_amt 32 75 10 32 75 15 32 75 20 32 …
Harry
  • 11
  • 4
1
vote
2 answers

select elements from jsonb postgres array

i have this jsonb in my database. I need to show the element that is in state 1 if and only if the previous element is in state 3 I need to make the query that brings me the element that meets the condition. is posible with postgres ? [ { …
Dario Paez
  • 41
  • 4
1
vote
1 answer

Getting latest price of different products from control table

I have a control table, where Prices with Item number are tracked date wise. id ItemNo Price Date --------------------------- 1 a001 100 1/1/2003 2 a001 105 1/2/2003 3 a001 110 1/3/2003 4 b100 50 1/1/2003 5 b100…
1
vote
3 answers

Query to assign serial number for rows without grouping together and without changing the order of rows

This table consists of the truck trips that carries containers to another location. Each time the truck carries a container, the records are stored. If it carries 1 unit, then I've store "1 Unit" under Unitcount Field and also assigned TripCount as…
1
vote
1 answer

Add new column with prior 7 transactions in SQL

For customer segmentation, I need to analysis how many transactions did the customer do in prior 7 days, prior 14 days etc. basis a given transaction date. Copied the input and required output tables I have tried using this but it seems the logic is…
AMP
  • 11
  • 1
1
vote
1 answer

Pyspark StandardScaler over a Window

I want to use the standardscaler pyspark.ml.feature.StandardScaler over window of my data. df4=spark.createDataFrame( [ (1,1, 'X', 'a'), (2,1, 'X', 'a'), (3,9, 'X', 'b'), (5,1, 'X', 'b'), (6,2, 'X', 'c'), …
1
vote
1 answer

how to update column with row number in group by result in MySQL

Sample Data: id | room_id | seat_num ---------------------------------- 1 | 1 | null 2 | 1 | null 3 | 2 | null 4 | 2 | null Desire Data: id | room_id | seat_num ---------------------------------- 1 …
Dolphin
  • 29,069
  • 61
  • 260
  • 539
1
vote
2 answers

MySQL find rows where yesterday's value is > X AND where last 5 days value < X exists

Let's say I have the following table: date | name | value ---------------------------- 2020-09-01 | name1 | 10 2020-09-02 | name1 | 9 2020-09-03 | name1 | 12 2020-09-04 | name1 | 11 2020-09-05 | name1 | 11 I would like to identify names…
Rich
  • 7
  • 2
1
vote
1 answer

Is it possible to rank a dataset by the result of another window function in T-SQL?

Is there any way to rank a dataset by the result of another window function? For example, I have a query like this one below: select distinct country, cast(sum(Sessions) over (partition by country) as float) / cast(sum(sessions) over() as…
1
vote
1 answer

How to find whether the same customers who ordered this month also ordered the next month?

I have an orders table Order_id User_id Order_date 1 32 2020-07-19 2 24 2020-07-21 3 27 2020-07-27 4 24 2020-08-14 5 32 2020-08-18 6 32 2020-08-19 7 …
Praneeth Vasarla
  • 113
  • 1
  • 1
  • 9
1
vote
2 answers

Group by 1 minute interval for the chain of actions sql BigQuery

I need to group the data with 1 minute interval for the chain of actions. My data looks like this: id MetroId Time ActionName refererurl 111 a 2020-09-01-09:19:00 First …
1
vote
3 answers

SQL lookup value per closest earlier date in other table

Re the solution to this question, wouldn't the correlated subquery be executed many times? Can you guys think of any other query that would perform better on really big datasets? Rather than using joins with corralated subquery, maybe by leveraging…