Questions tagged [partitioning]

Partitioning is a performance strategy in which a potentially very large set of data is divided into a number of smaller sets.

The expectation is that, for algorithms whose cost grows faster than linearly in N, the total time it takes to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
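As a rough illustration of that expectation, here is a minimal Python sketch. It assumes a deliberately quadratic sort (insertion sort) as the expensive algorithm and uses `heapq.merge` to combine the results; the function names and the `partition_size` parameter are hypothetical and chosen only for this example.

```python
import heapq


def insertion_sort(values):
    """Quadratic-time sort, standing in for any super-linear algorithm."""
    result = list(values)
    for i in range(1, len(result)):
        item = result[i]
        j = i - 1
        # Shift larger items one slot to the right to make room for item.
        while j >= 0 and result[j] > item:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = item
    return result


def partitioned_sort(values, partition_size):
    """Sort by splitting into fixed-size partitions, then merging."""
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    # Divide: slice the input into consecutive partitions.
    partitions = [
        values[start:start + partition_size]
        for start in range(0, len(values), partition_size)
    ]
    # Process each small partition with the expensive algorithm.
    sorted_partitions = [insertion_sort(part) for part in partitions]
    # Combine: merge the already-sorted partitions.
    return list(heapq.merge(*sorted_partitions))
```

With N items and partitions of size k, the quadratic work drops from roughly N² to about (N/k)·k² = N·k comparisons, plus the cost of the merge. Whether that is a net win in practice depends on the combine step and on the data.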

List partitioning is similar to range partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.
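To make "explicitly defined" concrete, the following Python sketch routes rows into partitions that are each declared by an explicit list of values. It is an illustration only, not any database's implementation: the function name, the `definitions` mapping, and the choice to reject unlisted values with an error are all assumptions of this example.

```python
def assign_to_list_partitions(rows, key, definitions):
    """Route rows into partitions defined by explicit value lists.

    definitions maps a partition name to the list of key values it holds.
    """
    # Build a reverse lookup from value to partition name, rejecting overlaps.
    value_to_partition = {}
    for name, values in definitions.items():
        for value in values:
            if value in value_to_partition:
                raise ValueError(f"value {value!r} is listed in two partitions")
            value_to_partition[value] = name

    partitions = {name: [] for name in definitions}
    for row in rows:
        value = row[key]
        if value not in value_to_partition:
            # Assumption of this sketch: unlisted values are rejected.
            raise KeyError(f"no partition is defined for value {value!r}")
        partitions[value_to_partition[value]].append(row)
    return partitions
```

For example, with `definitions = {"north": [1, 2], "south": [3]}` and rows keyed by a hypothetical `region_id` column, rows with `region_id` 1 or 2 land in `north` and rows with `region_id` 3 land in `south`.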

3138 questions

1 answer

Partition a column based on missing values

How would I go about partitioning a column based on missing values in Python? I have the following table in a dataframe: Store Bag Alberts ClothBag Vons KateSpade Ralphs GroceryBag1 Na apple Na pear Na …

python pandas partitioning

asked Sep 29 '17 at 22:39

Harold Chaw

1 answer

HBase: All data stored in one region

I'm importing HFiles into HBase using the command: hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles -Dcreate.table=no /user/myuser/map_data/hfiles my_table When I just had a look into the HBase Master UI, I saw that all data seems to…

apache-spark hbase partitioning

asked Sep 29 '17 at 13:28

D. Müller

1 answer

Missing FROM-clause entry for table »rowtype«

I am currently writing a function in the plpgsql language to create partitions which will hold sensor data for each month (one partition for one month and sensor). I am stuck with this error: ERROR: missing FROM-clause entry for table…

postgresql plpgsql partitioning dynamic-sql postgresql-9.4

asked Sep 28 '17 at 14:34

bajro

1 answer

Choosing random Pivot in QuickSort partitioning takes more time, how is this possible?

public static int partitionsimple_hoare(int[] arr,int l , int h){ int pivot = arr[l]; int i = l-1; int j = h+1; while(true){ do{ i++; }while(arr[i]

java random quicksort partitioning

asked Sep 21 '17 at 10:22

Akarsh Rastogi

1 answer

Count of sales partitioned by DOW (with date and time as input) - postgresql

I have scoured the internet for the right answer but am not finding what I want. I have an example dataset as follows: Date --------------------------------- Number of Sales Saturday 9th September 13:22:00 ------ 1 Sunday 10th September 16:44:02 …

postgresql partitioning postgresql-9.5 weekday

asked Sep 13 '17 at 11:22

user8497255

1 answer

Preserving the number of partitions of a Spark dataframe after transformation

I am looking at a bug in the code where a dataframe has been split into more partitions than desired (over 700), and this causes too many shuffle operations when I try to repartition it to only 48. I can't use a coalesce() here because I want…

apache-spark apache-spark-sql partitioning data-partitioning

asked Sep 12 '17 at 17:21

John Subas

3 answers

T-SQL progressive numbering partitions

I am aiming to obtain a record set like this date flag number 01 0 1 02 0 1 03 1 2 04 1 2 05 1 2 06 0 3 07 1 4 08 1 4 I start from the record set with "date" and "flag"…

sql-server tsql sql-server-2012 partitioning ranking

asked Sep 11 '17 at 10:14

RaffaeleT

1 answer

Best practices on Hazelcast persistence and multiple members

I went through several related topics here and it seems the topic is still open, official documentation does not cover it so here we are. There's a cluster with N members in one group There's one distributed map The map has persistence store backed…

distributed-computing partitioning hazelcast in-memory-database

asked Sep 07 '17 at 18:07

Viktor Stolbin

1 answer

Select parquet based on partition date

I have some heavy logs on my cluster and have converted all of them to Parquet with the following partition schema: PARTITION_YEAR=2017/PARTITION_MONTH=07/PARTITION_DAY=12. For example, if I want to select all my logs between 2017/07/12 and 2017/08/10, is there a way…

apache-spark pyspark partitioning parquet

asked Sep 04 '17 at 14:34

RobinFrcd

1 answer

ORA-14108: illegal partition-extended table name syntax

I have a requirement where I need to run an update script over multiple partitions of a table. I have written a script for it as below, but it gives ORA-14108: illegal partition-extended table name syntax Cause: Partition to be accessed may only be…

database oracle plsql oracle11g partitioning

asked Sep 02 '17 at 11:57

Mayank Mukherjee

3 answers

T-SQL group by partition

I have the table below in SQL Server 2008. Please help me get the expected output. Thanks. CREATE TABLE [dbo].[Test]([Category] [varchar](10) NULL,[Value] [int] NULL, [Weightage] [int] NULL,[Rn] [smallint] NULL ) ON [PRIMARY] insert into Test values…

tsql group-by partitioning

asked Jan 04 '11 at 19:36

user219628

0 answers

Oracle subpartition a subpartition

I have 3 columns which I would like to partition by, let's call them some_date DATE some_type VARCHAR2 some_product VARCHAR2 I would like to partition by range using some_date, then subpartition by list using some_type, then subpartition that…

oracle list range partitioning

asked Aug 22 '17 at 09:26

Jacek Trociński

3 answers

Split a list into all pairs in all possible ways

I am aware of many posts with similar questions and have been through all of them. However, I am not able to do what I need. I have a list, say l1=[0,1,2,3,4], which I want to partition into pairs of tuples like the following: [(0, 1), (2, 3), 4], [(0,…

python list partitioning

asked Aug 16 '17 at 03:54

Pankaj

0 answers

Optimal partitioning

I'm looking for a way to perform optimal partitioning of the following: I have a square that is divided into a number of small equally-sized squares, say N, and I need to group them into K groups so that the number of…

optimization partitioning

asked Jul 23 '17 at 14:41

user3657828

2 answers

Cassandra querying multiple partitions on a single node

We have less than 50GB of data for a table and we are trying to come up with a reasonable design for our Cassandra database. With so little data we are thinking of having all data on each node (2 node cluster with replication factor of 2 to start…

cassandra nodes partitioning cassandra-3.0

asked Jul 20 '17 at 08:18

eddyP23
