Questions tagged [partitioning]

Partitioning is a performance strategy whereby you divide possibly very large groups of data into some number of smaller groups of data.

The expectation is that, for algorithms whose cost grows faster than linearly in N, the total time to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
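As a rough illustration of that expectation, the sketch below counts comparisons for a quadratic-time insertion sort run on the whole input versus on smaller chunks that are merged afterwards. The function names, the chunk count, and the choice of insertion sort are assumptions made for this example only; the cost of the final merge is not counted.

```python
import heapq
import random


def insertion_sort(items):
    """Quadratic-time sort; returns (sorted_list, number_of_comparisons)."""
    result = list(items)
    comparisons = 0
    for i in range(1, len(result)):
        value = result[i]
        j = i - 1
        # Shift larger elements one slot right until value fits.
        while j >= 0:
            comparisons += 1
            if result[j] <= value:
                break
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = value
    return result, comparisons


def partitioned_sort(items, partitions):
    """Sort each chunk separately, then merge the sorted chunks."""
    size = max(1, -(-len(items) // partitions))  # ceiling division
    total_comparisons = 0
    sorted_chunks = []
    for start in range(0, len(items), size):
        chunk, comparisons = insertion_sort(items[start:start + size])
        sorted_chunks.append(chunk)
        total_comparisons += comparisons
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_chunks)), total_comparisons


random.seed(0)
data = [random.random() for _ in range(2000)]
whole, whole_cost = insertion_sort(data)
merged, split_cost = partitioned_sort(data, partitions=10)
print(whole_cost, split_cost)
```

Both approaches produce the same sorted list, but the chunked version needs far fewer comparisons inside the quadratic step, which is the effect the description above relies on.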

List partitioning is similar to range partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.
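To make "explicitly defined" concrete, here is a small sketch of how a value could be routed to a partition under a RANGE-style definition and under a LIST-style definition. It assumes the sentence above refers to the RANGE and LIST schemes found in databases such as MySQL; the partition names, bounds, and value sets are hypothetical, and the code models the idea only, not any database's implementation.

```python
import bisect

# Hypothetical RANGE definition: exclusive upper bounds in ascending order,
# in the spirit of "VALUES LESS THAN (bound)".
RANGE_BOUNDS = [1990, 2000, 2010]
RANGE_NAMES = ["p0", "p1", "p2"]

# Hypothetical LIST definition: each partition names its member values.
LIST_PARTITIONS = {
    "p_north": {3, 5, 6},
    "p_south": {1, 2, 10},
}


def range_partition(value):
    """Return the first partition whose upper bound is greater than value."""
    index = bisect.bisect_right(RANGE_BOUNDS, value)
    if index == len(RANGE_BOUNDS):
        raise ValueError(f"no range partition defined for {value}")
    return RANGE_NAMES[index]


def list_partition(value):
    """Return the partition whose explicit value set contains value."""
    for name, members in LIST_PARTITIONS.items():
        if value in members:
            return name
    raise ValueError(f"no list partition defined for {value}")


print(range_partition(1995), list_partition(10))
```

In both schemes a value that matches no defined partition is rejected here, which reflects the point that every partition has to be spelled out in advance.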

3138 questions
1
vote
1 answer

How to partition a MySQL hierarchy table structure

I am building a classifieds website. Can you suggest how I should partition a geolocation table by country? UPDATE: Can I partition it by latitude and longitude? Below is my table structure. CREATE TABLE IF NOT EXISTS `geo` ( `id` int(11) NOT…
kulls
  • 845
  • 2
  • 12
  • 37
1
vote
1 answer

Why can't MySQL use partition pruning with INSERT statements?

Here is a sentence: MySQL can apply partition pruning to SELECT, DELETE, and UPDATE statements. INSERT statements currently cannot be pruned. So when a new row is inserted, MySQL cannot determine which partition it belongs to? That sounds very strange. Is…
Cherry
  • 31,309
  • 66
  • 224
  • 364
1
vote
3 answers

ReduceByKey function In Spark

I've read somewhere that for operations that act on a single RDD, such as reduceByKey(), running on a pre-partitioned RDD will cause all the values for each key to be computed locally on a single machine, requiring only the final, locally reduced…
Nick
  • 2,818
  • 5
  • 42
  • 60
1
vote
2 answers

Partitioning possible on already existing table?

I want to know if I can use partitioning on an already existing table. I have a table which I want to divide on the basis of values in one column. I have some tickets (t_id) which belong to different accounts (a_id). I want to divide the tables based on…
olivia
  • 119
  • 1
  • 2
  • 9
1
vote
1 answer

Service Fabric Reliable Services: Communication and Partitioning essentials

While discovering SF Reliable Services I want to make sure that the following basic statements are true. Reliable Services Default Communication stack (DefaultStack) and Reliable Actors Communication stack (using ServiceProxy/ActorProxy) can only be used…
AsValeO
  • 2,859
  • 3
  • 27
  • 64
1
vote
0 answers

Array partitioning method I used in Java - should have worked, but I'm getting a java.lang.ArrayIndexOutOfBoundsException

I thought I could just share some code I just decided to do for a bit of fun. It deals with the problem of array partitioning and I have used Java to implement the solution. I am originally a C/C++ guy, and I would like some guidance as to the…
1
vote
1 answer

SQL Server and TPC-H table partitioning performance analysis: smaller partitions, fewer reads, higher CPU costs

I'm using TPC-H (SF 10) on my SQL Server 2014 database system. In order to improve query performance I decided to partition (same disk) two of the biggest tables (Lineitem and Orders) by the date column, because many of those queries use a date range.…
1
vote
1 answer

How do values get inserted into a MySQL hash-partitioned table?

I have created a MySQL table and hash partitioned it as below. mysql> CREATE TABLE employees ( id INT NOT NULL, fname VARCHAR(30), lname VARCHAR(30), hired DATE NOT NULL DEFAULT…
user5489250
1
vote
2 answers

Partitioning the data based on column values

Hi, I have a data source as follows: ID Date Page 100 27-10-2015 google 102 27-10-2015 facebook 102 27-10-2015 instagram 104 28-10-2015 yahoo 105 30-10-2015 bing I want to store this…
wazza
  • 770
  • 5
  • 17
  • 42
1
vote
1 answer

Dynamic partition in Hive

I have created a table with dynamic partitioning in Hive as below: create table sample(uuid String,date String,Name String,EmailID String,Comments String,CompanyName String,country String,url String,keyword String,source String) PARTITIONED BY (id…
wazza
  • 770
  • 5
  • 17
  • 42
1
vote
1 answer

MySQL Partition By Both DATE and INT

I have a table I want to partition using MySQL 5.7 Partitioning to mitigate issues I'm having with dropping old data quickly. (Also, it would be nice to have increased insert I/O performance by partitioning across something other than date,…
gfunk
  • 381
  • 1
  • 14
1
vote
1 answer

BigQuery - Custom Lag Offset while using Lag function

I have a BigQuery table as below: date hits_eventInfo_Category hits_eventInfo_Action session_id user_id hits_time hits_eventInfo_Label 20151021 Air Search 1445001 A232 1952 City1 20151021 Air Select 1445001 A232 2300 …
activelearner
  • 7,055
  • 20
  • 53
  • 94
1
vote
3 answers

Would partitioning the table improve the performance of this GROUP BY query?

I have a MySQL table say data_table mysql> desc data_table; +------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra …
Optimus
  • 2,716
  • 4
  • 29
  • 49
1
vote
0 answers

Partition pruning with Impala and Parquet

We have a fact table we wish to partition by month. (This is because of our quantity of data, and wanting to hit partition file sizes that are at least 256 MB as per Parquet best practice). I guess if data increases we may want to go weekly. The…
Codek
  • 5,114
  • 3
  • 24
  • 38
1
vote
1 answer

Creating sessions with conditional events

I have a list of web browsing data that I'm attempting to convert into sessions. An example dataset from a user: time_millis Type Result 07/10/2015 08:31 1 0 07/10/2015 08:41 1 0 07/10/2015 08:48 2 0 07/10/2015 08:50 2 …
user3937831