Questions tagged [partition-by]
266 questions
1
vote
2 answers
Oracle "connect by prior" along with "max() over partition by" to remove duplicate subtree
I am trying to remove duplicate subtree returned by "connect by prior" clause. I wish to check top level node of tree hierarchy, where user can enter child IDs which are already part of subtree. Take a look at following example:
SELECT * FROM (
With…

user613114
- 2,731
- 11
- 47
- 73
1
vote
0 answers
SQLite rank() over partition by support in c# System.Data.SQLite and Mono.Data.Sqlite
I am using sqlite in one of my application. I need to use rank() over partition by. I am using System.Data.SQLite provider. Does it support rank and other window functions? I am getting error while using this. Its showing
SQL logic error near "(":…

maeenul
- 91
- 2
- 5
1
vote
2 answers
SQL to calculate cumulative sum that resets based on previous value in a column in Hive
I am trying to create a cumulative value with something like this
KEY1 Date_ VAL1 CUMU_VAL2
K1 D1 1 0
K1 D2 1 1
K1 D3 0 2
K1 D4 1 0
K1 D5 1 1
So, the issue is…

yabtzey
- 391
- 1
- 7
- 17
1
vote
2 answers
How to select a single value from a column and use partition by to replicated over another column?
I want to create a new column called NewHeight that takes the single value found in the Height column and replicates it for all the same Name.
This is the query I am using.
Select
Name,
Height,
case when Height is not null then max(Height)…

luanstat
- 65
- 1
- 9
1
vote
1 answer
Writing Dataset/Dataframe to HDFS directory without name of the column after partitionBy() :Spark 2.1.0
I am using partitionBy() to partition the dataset/dataframe by some ID. But, when the file is written it is creating the directory with name and value of the column seperated by "=" symbol.
Seq partitionCols =…

wandermonk
- 6,856
- 6
- 43
- 93
1
vote
1 answer
How to group by contiguous data only in Oracle
I'm trying to display how many cages for each rate and the start and end date. This is going into an invoice where we bill for each cage per day and these cages can have different rates.
It isn't a simple GROUP BY and getting the MIN and MAX dates…

dokgu
- 4,957
- 3
- 39
- 77
1
vote
1 answer
Partition by to get the correct date
I would like to retrieve the oldest date under the FirstScanned Column. My issue is that a product can be registered again at some point as a new product, when this happens, I'd like the FirstScanned to be retrieve the oldest scanned date based on…

pancake
- 590
- 7
- 24
1
vote
1 answer
Date difference between rows by partition or level SQL
I am using Microsoft SQL Server. I have a table user ID's and login dates. I have the dates originally stored as Datetime, but I converted them to date. I need show the date difference in days between each login grouped by user ID; I want the…

MrCPT
- 37
- 1
- 7
1
vote
2 answers
Best way to select most recent record
So the database I am using does not have a great way to select the most recent number by its unique ID. We have to narrow down to get the most recent record with a bunch of sub queries joining back to the original table. The original table is…
user7966039
1
vote
0 answers
PySpark PartitionBy Timing out while writing S3 files
Following code is trying to take a bunch of files from one input S3 path and then write them into individual S3 folders with folder name as a date column in the input data.
from pyspark import SparkContext
from pyspark.sql.types import *
from…

androboy
- 817
- 1
- 12
- 24
1
vote
3 answers
Partition By over Two Columns in Row_Number function
I am trying to RANK the records using the following query:
SELECT
ROW_NUMBER() over (partition by
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate
order by TW.EMPL_ID,TW.Effective_Bdate)…

Whitewolf
- 186
- 2
- 5
- 14
1
vote
1 answer
Specified partition columns do not match the partition columns of the table.Please use () as the partition columns
here i'm trying to persist the data frame in to a partitioned hive table and getting this silly exception. I have looked in to it many times but not able to find the fault.
org.apache.spark.sql.AnalysisException: Specified partition columns
…

jack AKA karthik
- 885
- 3
- 15
- 30
1
vote
2 answers
Select upload history from Oracle table
I have a table contains a list of factories (factory Id and factory names)
Factory ID Factory Name
1001 Factory1
1002 Factory2
1003 Factory3
User will upload the some data into a history table for each factory, each…

Robin clave
- 608
- 6
- 16
- 32
1
vote
1 answer
Group By records based on 10 second time interval in SQL server
Requirement is to Group record of table based on 10 second time interval. Given table
Id DateTime Rank
1 2011-09-27 18:36:15 1
2 2011-09-27 18:36:15 1
3 2011-09-27 18:36:19 1
4 2011-09-27 18:36:23 …

agm92
- 314
- 1
- 3
- 11
1
vote
1 answer
using partition by, group by and count together
Blockquote
Copy/Paste data:
productNumber Account BuyDate SellDate RelevantYYYYMM
27530 8605148 11/5/15 7:03 PM 11/11/15 2:02 PM 201511
27530 8582063 10/14/15 7:17 PM 10/16/15 1:48 PM 201510
27530 8566748 9/30/15 6:51 PM…

NonProgrammer
- 1,337
- 2
- 23
- 53