Questions tagged [partitioning]

Partitioning is a performance strategy whereby you divide possibly very large groups of data into some number of smaller groups of data.

The expectation is that, for algorithms whose cost grows faster than linearly in N (for example O(N²) or worse), the total time it takes to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
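As a rough illustration of that expectation, here is a minimal Python sketch of a cost model. It assumes a naive all-pairs (quadratic) pass and assumes that only items in the same group ever need to be compared; the function names are made up for this example, and the cost of combining results is deliberately left out.

```python
def pairwise_cost(n: int) -> int:
    """Comparisons made by a naive all-pairs pass over n items (O(n^2))."""
    return n * (n - 1) // 2


def partitioned_cost(n: int, k: int) -> int:
    """Total comparisons when n items are split into k near-equal groups.

    Assumption: items in different groups never need to be compared,
    and the cost of combining the per-group results is ignored.
    """
    base, extra = divmod(n, k)
    # 'extra' groups get one additional item so the sizes sum to n.
    sizes = [base + 1] * extra + [base] * (k - extra)
    return sum(pairwise_cost(size) for size in sizes)


if __name__ == "__main__":
    print(pairwise_cost(1000))         # one large group
    print(partitioned_cost(1000, 10))  # ten groups of 100
```

Under these assumptions, ten groups of 100 items need about a tenth of the comparisons of one group of 1000. Whether that saving survives in practice depends on how expensive the combine step is.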

LIST partitioning is similar to RANGE partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.
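To make "explicitly defined" concrete, here is a small Python sketch of how a value could be routed to a partition under range-style and list-style schemes. The partition names, bounds, and value lists are invented for illustration, and this is not how any particular database implements it.

```python
from bisect import bisect_right

# Hypothetical range partitions: each holds values strictly below its bound.
RANGE_BOUNDS = [("p0", 1990), ("p1", 2000), ("p2", 2010)]

# Hypothetical list partitions: each holds an explicit set of values.
LIST_PARTS = {"p_north": {3, 5, 6}, "p_south": {1, 2, 10}}


def range_partition(value: int) -> str:
    """Return the first partition whose upper bound is greater than value."""
    bounds = [bound for _, bound in RANGE_BOUNDS]
    idx = bisect_right(bounds, value)
    if idx == len(RANGE_BOUNDS):
        raise ValueError(f"no range partition defined for {value}")
    return RANGE_BOUNDS[idx][0]


def list_partition(value: int) -> str:
    """Return the partition whose explicit value list contains value."""
    for name, members in LIST_PARTS.items():
        if value in members:
            return name
    raise ValueError(f"no list partition defined for {value}")
```

In both cases a value that no partition definition covers is rejected, which is the practical consequence of every partition having to be declared up front.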

3138 questions
1
vote
0 answers

Issues with partitioning

Looking for a scale-out option for my SQL Server database. I have a table that has 15 million rows and grows by 10,000 rows every day. Of the various options available to scale out, I opted for partitioning, as changes need to be done only in the database and…
1
vote
1 answer

MRJob same key gets sent to different reducers

So I have Hadoop 2.7.1 installed on a 3 machine cluster. I'm trying to run an inverted index mapreduce job using MRJob and Hadoop Streaming. Here's my configuration: MRJob.SORT_VALUES = True def steps(self): JOBCONF_STEP1 = { …
Jack
  • 486
  • 2
  • 5
  • 19
1
vote
1 answer

How to triple-boot Debian/Arch Linux/Ubuntu

I've been trying recently to partition my HDD to triple-boot Debian, Arch Linux and Ubuntu. Do I need to make a boot partition, a root partition etc. for all the OSes, or do I just need one boot partition? How can I partition my hard drive to…
1
vote
4 answers

Selecting pairs of rows in one row SQL

So I have this event audit table:

EventID | EventType | TaskID | Date             | Iteration
--------------------------------------------------------------
1       | start     | 12     | 01/01/2016 09:00 | 1
…
ifuwannaride
  • 121
  • 3
  • 13
1
vote
1 answer

BQ Partitioning by column instead of date

I'm trying to partition my tables in BQ, I've read the documentation and it always points to timePartition. I understand that this may be the default partition, but is it possible to define your table's column/s as the partition? Any inputs would…
C. Mags
  • 23
  • 2
1
vote
2 answers

loading data to hive dynamic partitioned tables

I have created a hive table with dynamic partitioning on a column. Is there a way to directly load the data from files using "LOAD DATA" statement? Or do we have to only depend on creating a non-partitioned intermediate table and load file data to…
1
vote
1 answer

Partition MySQL table by Column Value

I have a MySQL table with 20 million rows. I want to partition to boost speed. The table is in the following format: column column column sector data data data Capital Goods data data data Transportation data …
Ned Hulton
  • 477
  • 3
  • 12
  • 27
1
vote
1 answer

Postgres: BEFORE UPDATE trigger

Description: In our environment (Postgres 9.3) we use extensive partitioning on dates. Additionally, we use redirects to redirect INSERTs in the 'main' table to the corresponding child table (so do note that there actually is no data in the main…
Korenaga
  • 339
  • 2
  • 9
1
vote
1 answer

Is there a polynomial time algorithm to know whether a set of integers can be partitioned into two of equal sum?

If there is, I would be very grateful if anyone could direct me to one, preferably with a computer program that works for that purpose. I'm actually referring to a polynomial time algorithm that will only test (without the actual partitioning)…
rosacart
  • 23
  • 5
1
vote
1 answer

Querying data from har archives - Apache Hive

I am using Hadoop and facing the dreaded problem of large numbers of small files. I need to be able to create har archives out of existing hive partitions and query them at the same time. However, Hive apparently supports archiving partitions only…
Ankit Khettry
  • 997
  • 1
  • 13
  • 33
1
vote
1 answer

Using JedisCluster to write to a partition in a Redis Cluster

I have a Redis Cluster. I am using JedisCluster client to connect to my Redis. My application is a bit complex and I want to basically control to which partition data from my application goes. For example, my application consists of sub-module A,…
DTCool
  • 33
  • 6
1
vote
0 answers

SSAS: Slower processing when measure groups are partitioned

I have 1 year of data and tried partitioning large measure groups per month. The obvious reason why we do partitioning is to process only the latest data coming in. But there are instances where we need to process the whole cube/all partitions. The…
ggarcia
  • 47
  • 1
  • 10
1
vote
2 answers

How to divide players into divisions?

Let's say we have a two-player game where one player always wins (there can't be a draw). The question is: how do we divide n players into k divisions if we don't know anything about their skills? Each division should consist of the same number of…
Tomek Tarczynski
  • 2,785
  • 8
  • 37
  • 43
1
vote
1 answer

Migrate Oracle partitioned tables to SQL Server

I need to migrate about 700 Oracle partitioned tables (RANGE and LIST partitioning) to SQL Server. Turns out the SSMA (SQL Server Migration Assistant) does not handle Oracle partitioned tables (this is the official answer I got from Microsoft). Any…
1
vote
1 answer

Apache Spark: Join two RDDs with different partitioners

I have 2 RDDs with different partitioners. case class Person(name: String, age: Int, school: String) case class School(name: String, address: String) rdd1 is the RDD of Person, which I have partitioned based on age of the person, and then…
shashwat
  • 81
  • 5