Questions tagged [partitioning]

Partitioning is a performance strategy whereby you divide possibly very large groups of data into some number of smaller groups of data.

The expectation is that, for algorithms whose cost grows faster than linearly in N, the total time it takes to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
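As a rough illustration of that expectation, here is a minimal Python sketch. It assumes a deliberately quadratic algorithm (selection sort) and counts only its comparisons; the function names `selection_sort` and `partitioned_sort` are made up for this example, and the cost of merging the sorted groups (roughly N log k) is not counted.

```python
import heapq
import random


def selection_sort(values):
    """Sort a copy of values with a quadratic algorithm; return (sorted_list, comparisons)."""
    items = list(values)
    comparisons = 0
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items, comparisons


def partitioned_sort(values, num_partitions):
    """Split values into contiguous groups, sort each group, then merge the results."""
    values = list(values)
    # Ceiling division, with a floor of 1 so an empty input does not give a zero step.
    size = max(1, -(-len(values) // num_partitions))
    sorted_groups = []
    total_comparisons = 0
    for start in range(0, len(values), size):
        group, comparisons = selection_sort(values[start:start + size])
        sorted_groups.append(group)
        total_comparisons += comparisons
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_groups)), total_comparisons


data = random.sample(range(10_000), 1_000)
whole, whole_cost = selection_sort(data)
parts, parts_cost = partitioned_sort(data, 10)
print(whole_cost, parts_cost)  # 499500 49500
```

Selection sort always performs n(n-1)/2 comparisons, so ten groups of 100 elements cost ten times 4,950 comparisons instead of 499,500 for the single group of 1,000. The saving only holds while the combine step stays cheap relative to the per-group work.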

List partitioning is similar to range partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.

3138 questions
1
vote
1 answer

How to divide table for parallel loading

Which division option is better for performance: one based on the clustered index column, or one based on the partition (the same column)? I have to split a table to load it in parallel using SSIS (SQL Server 2008 R2 Enterprise Edition) into Oracle 11. First option…
PNPTestovir
  • 287
  • 3
  • 5
  • 12
1
vote
1 answer

JSR 352 Multiple Threads in a single partition?

Is it possible to have multiple threads in a single partition? If so, how?
Fazil Hussain
  • 425
  • 3
  • 16
1
vote
1 answer

Spark partitions: creating RDD partitions but not Hive partitions

This is a follow-up to "Save Spark dataframe as dynamic partitioned table in Hive". I tried to use the suggestions in the answers but couldn't make them work in Spark 1.6.1. I am trying to create partitions programmatically from a `DataFrame`. Here is the…
Sasha O
  • 3,710
  • 2
  • 35
  • 45
1
vote
3 answers

Google Bigquery inconsistent when variable names changes in ORDER BY clause

My goal is to test if the grp's generated by one query, are the same grp's as the output of the same query. However, when I change a single variable name, I get different results. Below I show an example of the same query where we know the results…
cgnorthcutt
  • 3,890
  • 34
  • 41
1
vote
1 answer

Single vs multiple partitions in Hive

Are there any tradeoffs in partitioning using date as a yyyymmdd string versus having multiple partitions for year, month and day as integers?
mottosan
  • 466
  • 3
  • 14
1
vote
2 answers

MySQL Partitioning in a real-world example

I want to try MySQL Partitioning in a football fantasy game where users are distributed in leagues, and each league has a market where users can sell or buy players. I'm experiencing some deadlocks in this table when a lot of users play at the same…
1
vote
1 answer

SQL Server - Create view retrieving most recent value before a given date

Suppose I have the following table in SQL Server (2012): MyTable: Date1: Col1: Val: 1/1/2016 c1 Val1 1/2/2016 c1 Val2 1/3/2016 c2 Val3 1/4/2016 c2 Val4 1/5/2016 c2 Val5 1/6/2016 c3 Val6 1/7/2016…
John Bustos
  • 19,036
  • 17
  • 89
  • 151
1
vote
2 answers

Postgres not using index for range query in partitioned table

I found that Postgres is not using an index for a range query on a partitioned table. The parent table and its partitions have their date column indexed using btree. A query like this: select * from parent_table where date >= '2015-07-01'; does…
1
vote
1 answer

Spark Streaming parallelism with one single key

I have built a prototype application with Spark Streaming in Java which uses HyperLogLog to estimate distinct users from a simulated click stream. Let me briefly sketch my solution. First I create a stream with the KafkaUtils:…
JayKay
  • 152
  • 11
1
vote
1 answer

SQL Server : preferred Index Characteristics/Qualities for Partition Column

I've created a script to search for candidate tables for partitioning, and using the index information, I'd like to find the ideal column on which to partition. I'm ignoring (for now) which columns are most often queried. I have a basic query…
John
  • 434
  • 2
  • 20
1
vote
0 answers

Mysql dynamic table partitioning maintenance procedure using scheduler

I have created the following table partitioning maintenance procedure, which works on a unixtime date column for a partitioned table, but it does not create a new partition for a non-partitioned table. When I debugged this procedure, I found the reason…
Ankit Agrawal
  • 2,426
  • 1
  • 13
  • 27
1
vote
2 answers

mysql 5.1 partitioning - do I have to remove the index/key element?

I have a table with several indexes. All of them contain a specific integer column. I'm moving to MySQL 5.1 and am about to partition the table by this column. Do I still have to keep this column as a key in my indexes, or can I remove it since…
Nir
  • 24,619
  • 25
  • 81
  • 117
1
vote
0 answers

What is the algorithm of OrientDB partitioning?

I can't find which partitioning algorithm OrientDB supports. I need a graph database that supports a clever partitioning or rebalancing algorithm to decrease the number of cut edges (edges that point to another server). Because I…
1
vote
0 answers

SQL Server - Computed column counting over partition

I am fairly new to creating tables in SQL Server, especially to computed columns, and am looking to make sure that I'm not creating a terribly inefficient database. As a simplified example of what I'm trying to accomplish, suppose I have the…
John Bustos
  • 19,036
  • 17
  • 89
  • 151
1
vote
1 answer

Why are the Atomic Host CentOS automatic partitions like they are?

I've been tasked with working with the CentOS Atomic Host distribution which comes preinstalled with Docker. My problem is I can pull from a host registry without a problem (but I don't know where it's stored), but what I'd really like to do is…
user3000724
  • 651
  • 3
  • 11
  • 22