Questions tagged [window-functions]

A window function is a type of SQL operation that aggregates over a partition of the result set.

As a basic example of a window function, consider a table that contains a list persons along with their age and gender.
If you wanted a query to return the complete list of people along with their age rank per gender, you could accomplish this most simply by using a ranking window function.
Such a function can be used to rank individual rows within a subset of the results.

Using a window function the above function would look like this:

SELECT name, 
       age, 
       gender, 
       RANK() OVER (PARTITION BY gender ORDER BY age DESC) AS age_rank
FROM persons

Window functions as defined by the SQL standard can also access rows before and after the current row using the lag() or lead() function. The following query calculates the delta to the previous row for the amount column for each customer:

SELECT customer_id,
       order_date,
       amount, 
       amount - lag(amount) over (partition by customer_id ORDER BY order_date) as delta
FROM order
ORDER BY customer_id, order_date;

Standard aggregates (e.g. count(), sum()) can also be used as a window function. If combined with an ORDER BY (inside the partition defintion) this will generate "running" totals based on the aggregate being used.

Window functions are available in a wide range of DBMS: Oracle (called "analytical functions" there), Microsoft SQL Server, DB2 (called "olap functions" there) PostgreSQL, Teradata, Sybase, Vertica, and partially in H2 Database.

4082 questions
1
vote
1 answer

Unexpected frame structure in pyspark window function

I am trying to understand how a frame is constructed when using a window function and seeing some unexpected behavior. This is the code (can be run as is): from pyspark.sql import SparkSession from pyspark.sql.types import StructField, StructType,…
Vitaliy
  • 8,044
  • 7
  • 38
  • 66
1
vote
1 answer

Get percentage of all rows in a similar grouping

In PostgreSQL, how can I get the average of all other rows (e.g., "val") in the same group (e.g., "retailer") while still outputting all rows? For example, below, I want to get the percentage of the val column per row compared to all other rows with…
user3871
  • 12,432
  • 33
  • 128
  • 268
1
vote
4 answers

Min() and Max() of multiple attributes in a partition window on SQL Server

I have a timetable in SQL Server that has the [SERV_ID] (service-id), [STATION] (station), [ARR] (arrivaltime), [DEP] (departuretime) of a public transport vehicle. Every Service can be present every day [SERV_DAY]. Target is to summarize…
Sam Amani
  • 85
  • 7
1
vote
3 answers

Oracle SQL data migration row to column based in month

CODE1 CODE2 CODE3 RATE VALUE MONTH A B C 1 1 202001 A B C 1 1 202002 A B C 1 1 202003 A B C 2 1 202004 A B …
Abdul
  • 942
  • 2
  • 14
  • 32
1
vote
2 answers

Redshift count consecutive days based on a condition and break(stop counting) when the condition doesn't

Table Date Region_1 Region_2 27-Sep 100 97.69 28-Sep 53.84 98.21 29-Sep 88.88 10 30-Sep 8.33 96 01-Oct 25 97.63 02-Oct 20 94.82 03-Oct 25 100 04-Oct 12.5 60.86 05-Oct 10 67.29 06-Oct 42.85 …
KKP
  • 75
  • 6
1
vote
1 answer

Get running count and sums of value of all rows that are started but not yet finished for each second

I have some event data that looks like this: | time | id | status | value | |-------------------------|----|----------|-------| | 2020-08-26T21:29:01.000 | 2 | started | 8 | | 2020-08-26T21:29:01.000 | 3 | started | 4 …
ydaetskcoR
  • 53,225
  • 8
  • 158
  • 177
1
vote
2 answers

How can I fill in/interpolate mising timestamps in an SQLite query?

I have a bunch of GPS tracks recorded in a table. Due to a bug in the GPS logging code, the fractional part of each timestamp is incorrect (and repeated). I would like to create a query that finds the first and last record with the same timestamp,…
Isaac
  • 105
  • 1
  • 6
1
vote
1 answer

Calculate running difference between two columns in Postgresql

I have data in two tables as below Table 1 : Material |Ordr Qty ---------|--------- abcd |4253 Table 2 : Material | Stck Qty ---------|--------- abcd |1000 abcd |2000 abcd |2000 Expected Output : Material |Ordr Qty |Stck Qty…
Varma
  • 23
  • 4
1
vote
1 answer

How to find duplicate records in MySQL, but with a degree of variance?

Assume I have the following table structure and data: +------------------+-------------------------+--------+ | transaction_date | transaction_description | amount | +------------------+-------------------------+--------+ | 2020-08-20 | Burger…
1
vote
2 answers

Adjusting the end dates

I wanted to manipulate the end date to next record startdate or records that happened on same day for shopID and then grab the starttime of later record and update the value in the prior row endtime if the starttime prior to next row start time and…
John
  • 47
  • 2
1
vote
1 answer

Cummulative sum by year_month, location, state in Hive SQL

I want to make a cumulative count using Hive SQL in recorrencia column according to the other ones. +------------+---------+-------+--------------+--+ | t.ano_mes | t.site | t.uf | recorrencia …
Henrique Branco
  • 1,778
  • 1
  • 13
  • 40
1
vote
3 answers

SQL query grouping by range

Hi have a table A with the following data: +------+-------+----+--------+ | YEAR | MONTH | PA | AMOUNT | +------+-------+----+--------+ | 2020 | 1 | N | 100 | +------+-------+----+--------+ | 2020 | 2 | N | 100 …
Elvis D.
  • 73
  • 1
  • 10
1
vote
1 answer

Convert Dense_Rank Oracle to PostgresSQL

I have a select statement that contains a dense_rank function written in Oracle but I'm having trouble figuring out how to make it compatible for postgressql (v11.x) SELECT facility_id_fk, max(inspection_date) as last_inspection, …
1
vote
4 answers

Postgresql select until certain total amount is reached and lock

I have a table of user batches. I only want to select until my amount total reaches a certain amount. id | user_id | balance | batch_id ----|---------|-------- |-------- 1 | 1 | 2 | 1 2 | 2 | 15 | 2 3 | 1 | 8 …
1
vote
1 answer

How to group each section of payments by the min date for each section the payment was made with windows functions

I have table that stores payment changed records. So each time the payment method is changed the payment used and date is stored. The data comes in bulk, but i only grab the first date the new payment was used. CREATE TABLE #payments ( pay_ID…