Questions tagged [partition-by]

266 questions
1
vote
2 answers

Displaying Unique Temperature Values for the US

I am looking to do something similar to what I did in R (below) in SQL: clim_data %>% select(Year, AverageTemperature, State) %>% group_by(Year,State) %>% summarize(value = mean(AverageTemperature), .groups = 'drop') ->…
mitchlarue
  • 13
  • 2
1
vote
1 answer

Partition of Bill no in last and previous by customer id based on date in SQL

I am trying to part bill no in latest and previous based on date by customer id. this is the data : this data contains bill number along with its start and end date by customer output should be like this : output should be into latest and previous…
zoro
  • 11
  • 2
1
vote
1 answer

Hive(3) function row_number() over(partition by...) issue

Trying to rank items from a list by column nb_pde descending so : DROP TABLE IF EXISTS temp_2; CREATE TEMPORARY TABLE temp_2 AS SELECT code_pde, nb_pde,row_number() OVER(PARTITION BY code_pde ORDER BY nb_pde DESC) AS rank FROM …
Marc Cabes
  • 11
  • 3
1
vote
1 answer

Do i really need HDFS for pyspark Partitionby()

while ruining this spark_df.write.option("header",True).partitionBy("variant").mode("overwrite") .csv("/tmp/covid") its throwing me error Py4JJavaError Traceback (most recent call…
1
vote
1 answer

How to Filter Data in SQL Without Using Group By / Having Functions

Have a situation where I'm trying to produce a query that shows the item # on a sales order and the total outstanding quantity across all orders. I can do this as follows: SELECT SL.[Item No_], SUM(SL.[Outstanding Quantity]) FROM [Database$Sales…
adhocEY
  • 75
  • 6
1
vote
2 answers

Why is a CTE needed to execute the following

The following query errors with an invalid column name 'rowid' SELECT row_number() over (partition by sales_rep order by timestamp desc) as rowid, * FROM dbo.you_gettheidea where rowid = 1 However, the following version works perfectly. I'm not…
ZdWhite
  • 501
  • 1
  • 3
  • 15
1
vote
1 answer

Finding adjacent column values from the last non-null value of a certain column in Snowflake (SQL) using partition by

Say I have the following table: ID T R 1 2 1 3 Y 1 4 1 5 1 6 Y 1 7 I would like to add a column which equals the value from column T based on the last non-null value from column R. This means the…
1
vote
1 answer

SQL - Average x per y, per z

I have a data set of two columns, an ID and a datetime I am trying to get an average count of hours per ID each month. So far I have managed to use this code to get me the count of hours for an individual day: SELECT id, DATE(created), …
Alastair
  • 63
  • 8
1
vote
1 answer

SQL Partition by with conditions

I want to partition the data on the basis of two columns Type and Env and fetch the top 5 records for each partition order by count desc. The problem that I'm facing is that I need to partition the Env on the basis of LIKE condition. Data…
mseth
  • 31
  • 4
1
vote
1 answer

Get consecutive count on repeated values with complex partition by in Postgres

Given a set of 3 columns (blue columns: student, semester, grade) I need to calculate for how many consecutive semesters has each student had the same grade (green column) using a query in Postgres. (first consecutive semester with avg 8: consec =…
eddh
  • 13
  • 4
1
vote
1 answer

postgreSQL, first date when cummulative sum reaches mark

I have the following sample table And the output should be the first date (for each id) when cum_rev reaches the 100 mark. I tried the following, because I taught with group bz trick and the where condition i will only get the first occurrence of…
pooq
  • 141
  • 7
1
vote
3 answers

SQL to get 3 adjacent actions without duplicate from the flags

I have a question that a little bit similar with question#66044663 but more complicated. Here's my dummy data. I want to get 3 adjacent actions(no duplicate) from the flag by each user. Here's the chart to describe my thought. Here's what I…
LucasLee
  • 75
  • 6
1
vote
0 answers

Conditional average in analytical function Big Query

By using window functions in Big Query, I want to calculate the average of a column based on some conditions. I use a window function in stead of a join because my actual dataset is very large and a window function speeds up my query. I manage to…
Yoni
  • 11
  • 1
1
vote
1 answer

Reset summation decimal value every time the rounding is greater than or equal to 0.01 with over partition by

I would like to help myself, I have a table with 4 columns, (Id, Grp, Amount, SumMonto). The Amount column is money type and handles 4 decimal places. so values ​​of 0.0001 are stored. I want to add the Cumulative Amount column and every time its…
1
vote
1 answer

How do I write a custom Window Function for Presto?

I'd like to create a custom Window function that processes a sorted stream of doubles and produces a single output double per partition. It is mandatory for values to be sorted. The Window function can process a single row at the time (no need for…