Questions tagged [partition]

Use this tag for questions about code that partitions data, memory, virtual machines, databases or disks.

In computing, partition may refer to:

  • Disk partitioning, the division of a hard disk drive
  • Partition (database), the division of a database
  • Logical partition (virtual computing platform) (LPAR), a subset of a computer's resources, virtualized as a separate computer
  • Memory partition, a subdivision of a computer's memory, usually for use by a single job
  • Binary space partitioning

source: https://en.wikipedia.org/wiki/Partition

Note that non-programming questions about database partitioning are likely to be better received on Database Administrators and disk partitioning on Server Fault.

1547 questions
7
votes
5 answers

Linked list partition function and reversed results

I wrote this F# function to partition a list up to a certain point and no further -- much like a cross between takeWhile and partition. let partitionWhile c l = let rec aux accl accr = match accr with | [] -> (accl, []) |…
Rei Miyasaka
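A minimal Python sketch of the behavior the excerpt above describes, assuming the intent is to collect the leading elements while the predicate holds and return the rest untouched. The name `partition_while` and that reading of "a cross between takeWhile and partition" are assumptions, since the F# body in the excerpt is truncated.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def partition_while(pred: Callable[[T], bool], items: List[T]) -> Tuple[List[T], List[T]]:
    """Split items at the first element for which pred is false.

    Hypothetical helper: the prefix keeps its original order because it is
    produced by slicing rather than by prepending to an accumulator.
    """
    for i, item in enumerate(items):
        if not pred(item):
            # Everything before index i satisfied pred; the rest is returned as-is.
            return items[:i], items[i:]
    # The predicate held for every element.
    return list(items), []


prefix, rest = partition_while(lambda x: x < 3, [1, 2, 5, 1])
print(prefix, rest)  # [1, 2] [5, 1]
```

A recursive version that conses each accepted element onto the front of an accumulator would typically produce the prefix in reverse, which may be what the title refers to; reversing the accumulator once at the end is the usual remedy.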
7
votes
2 answers

Partition data for AWS Athena results in a lot of small files in S3

I have a large dataset (>40G) which I want to store in S3 and then query with Athena. As suggested by this blog post, I could store my data in the following hierarchical directory structure to enable using MSCK REPAIR to automatically add…
panc
7
votes
1 answer

How do indexes work with a MySQL partitioned table

I have a table which contains information about time, so the table has columns like year, month, day, hour and so on. The table has data across years and is quite big, so I decided to partition it and started learning about MySQL…
Manoj-kr
7
votes
1 answer

How to trace back a large partition of a column family in cassandra

Through OpsCenter and nodetool cfstats I was able to find that one of the partitions of a keyspace table is 560 MB, but I couldn't find out which partition it is. How can we trace which partition of the table is that big?
user6288321
7
votes
4 answers

Sort when only equality is available

Suppose we have a vector of pairs: std::vector<std::pair<A, B>> v; where for type A only equality is defined: bool operator==(A const & lhs, A const & rhs) { ... } How would you sort it so that all pairs with the same first element end up close together?…
pqnet
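One way to picture the problem in the excerpt above is a quadratic grouping pass that relies only on `==`. This is a sketch in Python rather than the question's C++, and `group_by_equality` is a hypothetical name; it assumes nothing about the key type beyond equality, so no ordering or hashing is used.

```python
from typing import Any, List, Tuple


def group_by_equality(pairs: List[Tuple[Any, Any]]) -> List[Tuple[Any, Any]]:
    """Reorder pairs so that pairs with equal first elements are adjacent.

    Only == is applied to the first elements. The cost is O(n * k)
    comparisons, where k is the number of distinct first elements.
    """
    groups: List[List[Tuple[Any, Any]]] = []
    for pair in pairs:
        for group in groups:
            # Compare against one representative of each existing group.
            if group[0][0] == pair[0]:
                group.append(pair)
                break
        else:
            # No existing group matched, so this key starts a new group.
            groups.append([pair])
    # Flatten the groups back into a single list.
    return [pair for group in groups for pair in group]


print(group_by_equality([("a", 1), ("b", 2), ("a", 3)]))
# [('a', 1), ('a', 3), ('b', 2)]
```

Groups appear in order of first occurrence and keep the original relative order inside each group. Without an ordering or a hash on the key, comparing against one representative per group is about as cheap as this can get.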
6
votes
1 answer

Initial extent size when converting to partitioned table

Working in an Oracle 19c database on Linux x86/64, trying to convert a non-partitioned table to a partitioned table. Since Oracle 12, alter table modify partition has been available to convert non-partitioned tables to partitioned tables. I have a…
6
votes
2 answers

Is it a bad practice to have a Cassandra table with partitions of a single row?

Let's say I have a table like this CREATE TABLE request( transaction_id text, request_date timestamp, data text, PRIMARY KEY (transaction_id) ); The transaction_id is unique, so as far as I understand each partition in this table would…
6
votes
4 answers

Spark aggregate on multiple columns within partition without shuffle

I'm trying to aggregate a dataframe on multiple columns. I know that everything I need for the aggregation is within the partition; that is, there's no need for a shuffle because all of the data for the aggregation are local to the…
1472580
6
votes
2 answers

Creation of a partitioned external table with hive: no data available

I have the following file on HDFS: I create the structure of the external table in Hive: CREATE EXTERNAL TABLE google_analytics( `session` INT) PARTITIONED BY (date_string string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION…
rom
6
votes
1 answer

Split a set to sub sets using Lists.partition or Iterable.partition

I was wondering what is the most efficient way to split a set into subsets? Iterable<List<Integer>> partitions = Iterables.partition(numbers, 10); OR List<List<Integer>> partitions = Lists.partition(numbers, 10); What is the difference in Time…
userit1985
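The two calls in the excerpt above both cut a sequence into consecutive chunks of at most 10 elements. The following is a Python analogue, not Guava itself: `partition_lazy` and `partition_eager` are hypothetical names, and the lazy/eager split is only meant to illustrate the general trade-off between producing chunks on demand and building them all up front.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def partition_lazy(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` elements, one at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        # Pull the next `size` elements; an empty chunk means the input is exhausted.
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def partition_eager(items: List[T], size: int) -> List[List[T]]:
    """Build all chunks up front by slicing an indexable list."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


print(list(partition_lazy(range(5), 2)))    # [[0, 1], [2, 3], [4]]
print(partition_eager([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

The lazy version accepts any iterable, including a set, while the eager version needs an indexable list, so a set would have to be copied into a list first.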
6
votes
1 answer

Efficient grouping by key using mapPartitions or partitioner in Spark

So, I have data like the following, [ (1, data1), (1, data2), (2, data3), (1, data4), (2, data5) ] which I want to convert to the following, for further processing. [ (1, [data1, data2, data4]), (2, [data3, data5]) ] I used groupByKey and…
joshsuihn
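The target transformation in the excerpt above can be sketched in plain Python to make the expected shape concrete. This is not Spark code and says nothing about shuffle cost; `group_by_key` is a hypothetical helper, and the string values stand in for the question's `data1` … `data5`.

```python
from typing import Dict, Hashable, Iterable, List, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def group_by_key(pairs: Iterable[Tuple[K, V]]) -> List[Tuple[K, List[V]]]:
    """Collect the values of each key into a list, keeping first-seen key order."""
    grouped: Dict[K, List[V]] = {}
    for key, value in pairs:
        # dict preserves insertion order, so keys come out in first-seen order.
        grouped.setdefault(key, []).append(value)
    return list(grouped.items())


data = [(1, "data1"), (1, "data2"), (2, "data3"), (1, "data4"), (2, "data5")]
print(group_by_key(data))
# [(1, ['data1', 'data2', 'data4']), (2, ['data3', 'data5'])]
```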
6
votes
0 answers

Seeing "partition doesn't exist" warnings/failures after using the Kafka partition reassignment tool

I am using Kafka 0.8.1.1. I have a 3-node Kafka cluster with some topics having around 5 partitions. I planned to increase the number of nodes in the cluster to 5 and move some partitions from existing topics to the new brokers. Previous partition…
6
votes
2 answers

How to select from partition while concatenating its name in MySQL

I have the same problem as this one: how to select dynamically in select * from partiton (Partition name)? but in MySQL. When using: select concat('p', year(now()), month(now())); Response…
eladelad
6
votes
4 answers

How to get partition names in Oracle when I input a date

I have a table with many range partitions. I need to get the names of all partitions when I give a date. For example: if I input the date 20/09/2014, it should list all partitions before that given date. create or replace function get_part_name(p_date in…
Charles Peter
6
votes
2 answers

Find Partition Scheme Definitions in SQL Server Database

I have access to a database and I need to know the partition scheme definitions in it, i.e. the partition scheme name, which partition function it is using, which filegroups the partitions are assigned to, etc. For example…