Questions tagged [partitioning]

Partitioning is a performance strategy whereby you divide possibly very large groups of data into some number of smaller groups of data.

The expectation is that, for algorithms whose running time grows faster than linearly in N, the total time it takes to process the smaller groups and combine the results is still less than the time it would take to process the one larger set of data.
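
As a rough illustration of that expectation, here is a minimal Python sketch that counts equal pairs with a deliberately quadratic scan, once over the whole list and once per hash partition. The function names, the sample data, and the choice of 8 partitions are assumptions made up for this example. Equal values always hash to the same partition, so the result is unchanged while the number of comparisons drops.

```python
from collections import defaultdict


def count_equal_pairs_naive(items):
    """Compare every pair; return (equal_pairs_found, comparisons_made)."""
    found = comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                found += 1
    return found, comparisons


def count_equal_pairs_partitioned(items, num_partitions):
    """Hash-partition the items, then run the same quadratic scan per partition."""
    partitions = defaultdict(list)
    for item in items:
        # Equal items have equal hashes, so they land in the same partition.
        partitions[hash(item) % num_partitions].append(item)

    found = comparisons = 0
    for part in partitions.values():
        part_found, part_comparisons = count_equal_pairs_naive(part)
        found += part_found              # "combine" step: just add the counts
        comparisons += part_comparisons
    return found, comparisons


# Hypothetical sample data: 250 distinct values, each appearing 4 times.
data = [i % 250 for i in range(1000)]
naive_found, naive_cmp = count_equal_pairs_naive(data)
part_found, part_cmp = count_equal_pairs_partitioned(data, 8)
print(naive_found, naive_cmp)
print(part_found, part_cmp)
```

The saving only appears because the per-partition work is superlinear and the combine step is cheap; for a linear scan, partitioning on a single machine would add overhead without reducing the total work.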

List partitioning is similar to range partitioning in many ways. As in partitioning by RANGE, each partition must be explicitly defined.

3138 questions
1
vote
3 answers

Switch Partitioning failing due to mismatch in file groups of source and destination tables

I am trying to implement partition switching on one of my tables, and I made sure that the partition function, scheme, and filegroups are working fine. But I get a filegroup error when I run the command below. Can someone share your thoughts on…
Teja
  • 13,214
  • 36
  • 93
  • 155
1
vote
1 answer

Interval partitioning on varchar2 column

I have a huge table that holds a lot of data, both historical and current. I have an automatic job that moves old data from some tables to historic tables (and then deletes it from the source). So I want to build an interval-partitioned table which gets the…
user2671057
  • 1,411
  • 2
  • 25
  • 43
1
vote
2 answers

MySQL partition and unique key

We have a table like this to save login tokens per user session. This table was not partitioned earlier, but we have now decided to partition it to improve performance, as it contains over a few million rows. CREATE TABLE `tokens` ( `id`…
mesibo
  • 3,970
  • 6
  • 25
  • 43
1
vote
1 answer

Ranking over multiple columns in pandas

I have this data frame: dict_data = {'id' : [1,1,1,2,2,2,2,2], 'datetime' : np.array(['2016-01-03T16:05:52.000000000', '2016-01-03T16:05:52.000000000', '2016-01-03T16:05:52.000000000', '2016-01-27T15:45:20.000000000', …
Nick
  • 21
  • 5
1
vote
1 answer

Scaling a follower model

The problem is somewhat similar to Twitter's/Facebook's: there are followers and following; users add items; subsequently you see the items added by all the people you are following. Problem A: how to keep the query for items added by people you are following…
Thierry
  • 3,225
  • 1
  • 26
  • 26
1
vote
0 answers

How to arrange multi partitions in hive?

Say I have an order table which contains multiple time columns (spend_time, expire_time, withdraw_time). Usually I query the table on each of these columns independently, so how do I create the partitions? order_no | spend_time | expire_time |…
lei yu
  • 58
  • 6
1
vote
0 answers

Table and Index Partitioning & Filtered Index in SQL Server 2016 SP1

Table and Index Partitioning: I am planning to use table partitioning for one of my existing databases. All the tables in the database have a clustered index and a non-unique non-clustered index. The non-unique non-clustered index is built on the…
DBK
  • 403
  • 4
  • 13
1
vote
2 answers

Why can't direct routing be used for distributed data with a secondary index?

I'm reading the following article: Elements of Scale: Composing and Scaling Data platforms I'm stuck on understanding the following sentences: A secondary index is an index that isn’t on the primary key. This means the data will not be partitioned…
1
vote
1 answer

Why the same HashPartitioner applied on two RDDs with same keys doesn't partition equally

I have two RDDs with same keys and different values. I call on both of them the same .partitionBy(partitioner) and then I join them: val partitioner = new HashPartitioner(partitions = 4) val a = spark.sparkContext.makeRDD(Seq( (1, "A"), (2, "B"),…
fpopic
  • 1,756
  • 3
  • 28
  • 40
1
vote
1 answer

Palindrome partitioning with interval scheduling

So I was looking at the various algorithms for solving the palindrome partitioning problem. For example, for the string "banana" the minimum number of cuts so that each substring is a palindrome is 1, i.e. "b|anana". Now I tried solving this problem using interval…
quintin
  • 812
  • 1
  • 10
  • 35
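
For reference on the palindrome partitioning question above, a minimal Python sketch of the usual dynamic-programming solution for the minimum number of cuts. This is not the interval-scheduling approach the asker describes; the function name `min_palindrome_cuts` is made up for this example.

```python
def min_palindrome_cuts(s):
    """Minimum number of cuts so that every piece of s is a palindrome."""
    n = len(s)
    if n == 0:
        return 0

    # is_pal[i][j] is True when s[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    # cuts[j] is the minimum number of cuts needed for the prefix s[0..j].
    cuts = [0] * n

    for j in range(n):
        best = j  # worst case: cut between every pair of adjacent characters
        for i in range(j + 1):
            # s[i..j] is a palindrome if its ends match and the inside is one.
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
                # If the palindrome starts at 0, the whole prefix needs no cut.
                best = 0 if i == 0 else min(best, cuts[i - 1] + 1)
        cuts[j] = best

    return cuts[n - 1]


print(min_palindrome_cuts("banana"))  # 1, matching the "b|anana" split in the question
```

This runs in O(n²) time and space, which is the baseline any greedy or interval-based idea would need to match in correctness.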
1
vote
1 answer

Partition algorithm without loop, only recursion

Given a list of integers, find out whether it can be divided into two sublists with equal sum. The numbers in the list are not sorted. For example: A list like [1, 2, 3, 4, 6] will return true, because 2 + 6 = 1 + 3 + 4 = 8 A list like [2, 1, 8, 3]…
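
One way to express the equal-sum split from the question above with recursion only, as a minimal Python sketch. The function name `can_split` and its extra parameters are assumptions for illustration. Each call decides which side the current number goes to, so no explicit loop is needed.

```python
def can_split(numbers, index=0, left=0, right=0):
    """Return True if numbers can be divided into two sublists with equal sum.

    index is the position being decided; left and right are the running sums
    of the two sublists built so far.
    """
    if index == len(numbers):
        # Every number has been assigned; compare the two sums.
        return left == right
    value = numbers[index]
    # Try the current number in the left sublist, then in the right one.
    return (can_split(numbers, index + 1, left + value, right)
            or can_split(numbers, index + 1, left, right + value))


print(can_split([1, 2, 3, 4, 6]))  # True: 2 + 6 = 1 + 3 + 4 = 8
```

This explores up to 2^n assignments, so it is only practical for short lists; memoising on (index, left) would be the usual next step.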
1
vote
1 answer

Iterative partitioning and labeling using data.table

I have an iterative method of partitioning that assigns a label to each observation and continues until all partitions are less than or equal to the specified minimum observations. Using data.table I have run into issues incorporating '{' and ':='. …
Fred Viole
  • 153
  • 7
1
vote
1 answer

Postgres constraint exclusion for parameterised, prepared query

As of Postgres 9.2, constraint exclusion can now be performed on constraints that use parameterised values (see 5.9.6 Caveats). However my guess is that this would not apply to a prepared statement with a parameterized constraint, as query planning…
1
vote
1 answer

Finding the kth smallest element in an unsorted array

I'm trying to implement the following pseudocode. I need to do this using logical partitions only. Procedure SELECT( k,S) { if |S| =1 then return the single element in S else { choose an element a randomly from S; let S1,S2,and S3 be…
Flaom
  • 134
  • 11
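
For the SELECT pseudocode in the question above, a minimal Python sketch of randomized selection. The quoted pseudocode is cut off, so this assumes S1, S2, and S3 are the elements less than, equal to, and greater than the chosen element, and that k is 1-based. It builds new lists for clarity, so it does not satisfy the "logical partitions only" requirement; an in-place version would track index ranges instead.

```python
import random


def select(k, s):
    """Return the k-th smallest element of the list s (k is 1-based)."""
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and len(s)")
    if len(s) == 1:
        return s[0]

    a = random.choice(s)              # choose an element a randomly from S
    s1 = [x for x in s if x < a]      # assumed: elements less than a
    s2 = [x for x in s if x == a]     # assumed: elements equal to a
    s3 = [x for x in s if x > a]      # assumed: elements greater than a

    if k <= len(s1):
        return select(k, s1)          # answer lies among the smaller elements
    if k <= len(s1) + len(s2):
        return a                      # answer is the chosen element itself
    return select(k - len(s1) - len(s2), s3)


print(select(3, [7, 2, 9, 4, 1]))  # 4
```

Because s2 always contains at least the chosen element, every recursive call works on a strictly smaller list, so the recursion terminates even with duplicates.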
1
vote
0 answers

PostgreSQL: Table has type character varying, but query expects timestamp without time zone

In a PostgreSQL database, when I tried to use the COPY command to insert a timestamp value into a table that is partitioned on that timestamp field, I got the error message below: ERROR: attribute 11 has wrong type DETAIL: Table has type…