Questions tagged [partitioning]

Partitioning is a performance strategy whereby you divide possibly very large groups of data into some number of smaller groups of data.

The expectation is that, for algorithms whose cost grows faster than linearly in N, the total time it takes to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
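
As a rough illustration of that expectation, here is a minimal sketch that uses a toy cost model: an algorithm that compares every pair of items, so its cost is quadratic in the group size. The function names and the cost model are assumptions made for this example, and the cost of combining the partial results is deliberately left out.

```python
def pairwise_comparisons(n: int) -> int:
    """Toy cost model (an assumption): number of unordered pairs among n items."""
    return n * (n - 1) // 2


def partitioned_cost(n: int, k: int) -> int:
    """Total pair comparisons when n items are split into k near-equal groups.

    The cost of combining the per-group results is not modelled here.
    """
    base, extra = divmod(n, k)
    # 'extra' groups get one more item so the sizes add up to n.
    sizes = [base + 1] * extra + [base] * (k - extra)
    return sum(pairwise_comparisons(size) for size in sizes)


whole = pairwise_comparisons(1_000_000)
split = partitioned_cost(1_000_000, 100)
print(whole, split)
```

Under this model, splitting one million items into 100 groups cuts the comparison count by roughly a factor of 100. Whether that gain survives in practice depends on how expensive the combine step is, which this sketch ignores.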

List partitioning is similar to range partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.

3138 questions
1
vote
1 answer

Postgres GRANT not applied on parent

I'm having trouble with GRANT in PostgreSQL (version 9.3). I'm trying to restrict a role 'client_1'. I want it to be able to run only SELECT on one table, but there is inheritance between tables. Here is my table structure: CREATE TABLE public.table_a…
firetonton
  • 324
  • 1
  • 3
  • 13
1
vote
1 answer

How to store range/hash composite partitions in separate datafiles by range?

I'm creating a database which will utilize composite partitioning. I will partition one table using range partitioning (by date) and then further subpartition it by hash (by client id). So far so good, no problem, but I also need to have those…
rattaman
  • 506
  • 3
  • 15
1
vote
1 answer

How does index assignment to nodes work in Spark with mapPartitionsWithIndex()?

I am trying to coordinate GPU execution on a Spark cluster. In order to achieve this I need each task/partition to only use a specific GPU slot per system. Each system has 4 GPUs, and the easiest way I have found to achieve this is by doing a…
alfredox
  • 4,082
  • 6
  • 21
  • 29
1
vote
1 answer

Partition existing PostgreSQL table

I have a huge table (~500M rows), which I did not partition at the time of loading the data. If I create the partitions now, do I need to manually move the data from the master table to the child tables? Are there any better options?
let_there_be_light
  • 837
  • 3
  • 9
  • 15
1
vote
0 answers

HDInsight, Hive partitioning and bucketing (big data)

I hope you are all fine. I have a requirement to save logs (a huge quantity) into HDInsight (into blobs, and then using Hive to query them via some BI analytics software). For one day I have something like 30 million .json files. The…
Pablo Morelli
  • 123
  • 3
  • 8
1
vote
1 answer

Before and After trigger on the same event? Fill a child table PostgreSQL

Situation: I have a database in PostgreSQL 9.5 used to store object locations by time. I have a main table named "position" with the columns (only the relevant ones): position_id, position_timestamp, object_id. It is partitioned into 100 child tables on…
1
vote
1 answer

Did I beat the CAP Theorem with this master-slaves distributed system (with picture)?

I was watching this video about the CAP theorem, where the author explains well the trade-offs of distributed systems. However I disagree with the CAP theorem in the following aspect. Given the picture below: Whenever there is a partition, in other…
1
vote
1 answer

How to guarantee repartitioning in Spark Dataframe

I'm pretty new to Apache Spark and I'm trying to repartition a dataframe by U.S. State. I then want to break each partition into its own RDD and save to a specific location: schema = types.StructType([ types.StructField("details",…
kellanburket
  • 12,250
  • 3
  • 46
  • 73
1
vote
0 answers

linux lvm partition using partman in preseed

I'm trying to customise the preseed file on Ubuntu 14.04, where all the parameters required for installation are stored. At the time of first OS boot, values and variables are exported, and configuration is completed without any manual intervention. During…
suhas
  • 733
  • 5
  • 13
1
vote
1 answer

Graph partitioning in groups of n vertices each

Is there any graph partitioning method that can partition a graph into groups of at most n vertices? Example: I have a graph with 1000 vertices and I want to partition it into subgraphs with at most 100 vertices. There can be 2 subgraphs with 50…
vladg
  • 13
  • 1
  • 3
1
vote
0 answers

Array partitioning — Minimizing maximum sum vs. minimizing absolute value difference of sums

Are these two problems isomorphic to each other? (1) Finding the position in an array that splits it into two partitions, minimizing the larger of the two partition sums. (2) Finding the position in an array that splits it into two…
1110101001
  • 4,662
  • 7
  • 26
  • 48
1
vote
2 answers

How to create a partition in Azure SQL Table

I am going to create SQL tables in an Azure SQL database, and I want to partition a table, but I don't know how to do that. Can anyone show me a demo example or query to do this? I am using SQL Server Management Studio to connect to my Azure…
Vinit Patel
  • 2,408
  • 5
  • 28
  • 53
1
vote
0 answers

Create JavaPairRDD from a collection with a custom partitioner

Is it possible to create a JavaPairRDD from a List> with a specified partitioner? The method parallelizePairs in JavaSparkContext only takes the number of slices and does not allow using a custom partitioner. Invoking…
1
vote
1 answer

MySQL - Uneven Distribution of Data into Partitions When Using Key Partitioning

I'm using the InnoDB engine on MySQL 5.7. I have a table where one of the columns is a (non-unique) three-letter country code (e.g. "SGP" for Singapore, "JPN" for Japan, etc). For most of my queries, this country code column is the first WHERE…
Edwin Lee
  • 3,540
  • 6
  • 29
  • 36
1
vote
0 answers

Why, in this example, is the Hoare partition algorithm not returning correct pivot index position?

Let's say I have the following array: 3, 6, 9, 1, 4. I want to partition it around a pivot, using the Hoare algorithm: Hoare-Partition (A, p, r) x ← A[p] i ← p − 1 j ← r + 1 while TRUE repeat j ← j − 1 until …
tigrou
  • 4,236
  • 5
  • 33
  • 59
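
For readers following the Hoare partition question above, here is a minimal Python sketch of the scheme. The pseudocode in the excerpt is cut off after its first loop, so the rest of the function follows the usual textbook formulation, which is an assumption about what the asker wrote. The name `hoare_partition` is chosen for this example.

```python
def hoare_partition(a, p, r):
    """Textbook-style Hoare partition of a[p..r] in place, with pivot x = a[p].

    Returns an index j such that every element of a[p..j] is <= every
    element of a[j+1..r]. The completion past the excerpt's first loop
    is assumed from the standard formulation.
    """
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        # repeat j <- j - 1 until a[j] <= x
        j -= 1
        while a[j] > x:
            j -= 1
        # repeat i <- i + 1 until a[i] >= x
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


data = [3, 6, 9, 1, 4]
split = hoare_partition(data, 0, len(data) - 1)
print(split, data)  # prints: 0 [1, 6, 9, 3, 4]
```

In this formulation the returned value is a split point between the two halves, not necessarily the final position of the pivot: here the function returns 0 while the pivot 3 ends up at index 3.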