Questions tagged [partition-by]

266 questions
1
vote
2 answers

Oracle "connect by prior" along with "max() over partition by" to remove duplicate subtree

I am trying to remove duplicate subtree returned by "connect by prior" clause. I wish to check top level node of tree hierarchy, where user can enter child IDs which are already part of subtree. Take a look at following example: SELECT * FROM ( With…
user613114
  • 2,731
  • 11
  • 47
  • 73
1
vote
0 answers

SQLite rank() over partition by support in c# System.Data.SQLite and Mono.Data.Sqlite

I am using sqlite in one of my application. I need to use rank() over partition by. I am using System.Data.SQLite provider. Does it support rank and other window functions? I am getting error while using this. Its showing SQL logic error near "(":…
maeenul
  • 91
  • 2
  • 5
1
vote
2 answers

SQL to calculate cumulative sum that resets based on previous value in a column in Hive

I am trying to create a cumulative value with something like this KEY1 Date_ VAL1 CUMU_VAL2 K1 D1 1 0 K1 D2 1 1 K1 D3 0 2 K1 D4 1 0 K1 D5 1 1 So, the issue is…
yabtzey
  • 391
  • 1
  • 7
  • 17
1
vote
2 answers

How to select a single value from a column and use partition by to replicated over another column?

I want to create a new column called NewHeight that takes the single value found in the Height column and replicates it for all the same Name. This is the query I am using. Select Name, Height, case when Height is not null then max(Height)…
luanstat
  • 65
  • 1
  • 9
1
vote
1 answer

Writing Dataset/Dataframe to HDFS directory without name of the column after partitionBy() :Spark 2.1.0

I am using partitionBy() to partition the dataset/dataframe by some ID. But, when the file is written it is creating the directory with name and value of the column seperated by "=" symbol. Seq partitionCols =…
wandermonk
  • 6,856
  • 6
  • 43
  • 93
1
vote
1 answer

How to group by contiguous data only in Oracle

I'm trying to display how many cages for each rate and the start and end date. This is going into an invoice where we bill for each cage per day and these cages can have different rates. It isn't a simple GROUP BY and getting the MIN and MAX dates…
dokgu
  • 4,957
  • 3
  • 39
  • 77
1
vote
1 answer

Partition by to get the correct date

I would like to retrieve the oldest date under the FirstScanned Column. My issue is that a product can be registered again at some point as a new product, when this happens, I'd like the FirstScanned to be retrieve the oldest scanned date based on…
pancake
  • 590
  • 7
  • 24
1
vote
1 answer

Date difference between rows by partition or level SQL

I am using Microsoft SQL Server. I have a table user ID's and login dates. I have the dates originally stored as Datetime, but I converted them to date. I need show the date difference in days between each login grouped by user ID; I want the…
MrCPT
  • 37
  • 1
  • 7
1
vote
2 answers

Best way to select most recent record

So the database I am using does not have a great way to select the most recent number by its unique ID. We have to narrow down to get the most recent record with a bunch of sub queries joining back to the original table. The original table is…
user7966039
1
vote
0 answers

PySpark PartitionBy Timing out while writing S3 files

Following code is trying to take a bunch of files from one input S3 path and then write them into individual S3 folders with folder name as a date column in the input data. from pyspark import SparkContext from pyspark.sql.types import * from…
androboy
  • 817
  • 1
  • 12
  • 24
1
vote
3 answers

Partition By over Two Columns in Row_Number function

I am trying to RANK the records using the following query: SELECT ROW_NUMBER() over (partition by TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate order by TW.EMPL_ID,TW.Effective_Bdate)…
Whitewolf
  • 186
  • 2
  • 5
  • 14
1
vote
1 answer

Specified partition columns do not match the partition columns of the table.Please use () as the partition columns

here i'm trying to persist the data frame in to a partitioned hive table and getting this silly exception. I have looked in to it many times but not able to find the fault. org.apache.spark.sql.AnalysisException: Specified partition columns …
jack AKA karthik
  • 885
  • 3
  • 15
  • 30
1
vote
2 answers

Select upload history from Oracle table

I have a table contains a list of factories (factory Id and factory names) Factory ID Factory Name 1001 Factory1 1002 Factory2 1003 Factory3 User will upload the some data into a history table for each factory, each…
Robin clave
  • 608
  • 6
  • 16
  • 32
1
vote
1 answer

Group By records based on 10 second time interval in SQL server

Requirement is to Group record of table based on 10 second time interval. Given table Id DateTime Rank 1 2011-09-27 18:36:15 1 2 2011-09-27 18:36:15 1 3 2011-09-27 18:36:19 1 4 2011-09-27 18:36:23 …
agm92
  • 314
  • 1
  • 3
  • 11
1
vote
1 answer

using partition by, group by and count together

Blockquote Copy/Paste data: productNumber Account BuyDate SellDate RelevantYYYYMM 27530 8605148 11/5/15 7:03 PM 11/11/15 2:02 PM 201511 27530 8582063 10/14/15 7:17 PM 10/16/15 1:48 PM 201510 27530 8566748 9/30/15 6:51 PM…
NonProgrammer
  • 1,337
  • 2
  • 23
  • 53