Questions tagged [data-partitioning]

Data partitioning is the division of a collection of data into smaller collections to speed up processing, simplify statistics gathering, and reduce the memory or persistence footprint.

337 questions
1 vote, 1 answer

How does sofs:partitions in Erlang work?

Note: This question is based on a rethinking of my previous, similar question. I would like to know whether Erlang's sofs:partition does the same thing that is described on Wikipedia's page about set partitions. If it does, how can I get the following…
skanatek
1 vote, 1 answer

How do I generate set partitions of a certain size?

I would like to generate partitions for a set in a specific way: I need to filter out all partitions which are not of size N in the process of generating these partitions. The general solution is "Generate all “unique” subsets of a set (not a…
skanatek
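
One way to avoid generating every partition and filtering afterwards is to build only partitions with the required number of blocks. The following is a minimal Python sketch, assuming "size N" means exactly N non-empty blocks; the function name `partitions_into_k` is hypothetical and not taken from the question.

```python
def partitions_into_k(items, k):
    """Yield every partition of the list `items` into exactly k non-empty blocks."""
    n = len(items)
    if k == 0:
        # Only the empty collection can be split into zero blocks.
        if n == 0:
            yield []
        return
    if n < k:
        # Not enough elements to fill k non-empty blocks.
        return
    first, rest = items[0], items[1:]
    # Case 1: `first` forms a block on its own; the rest fills k - 1 blocks.
    for part in partitions_into_k(rest, k - 1):
        yield [[first]] + part
    # Case 2: `first` joins one block of a k-block partition of the rest.
    for part in partitions_into_k(rest, k):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
```

The recursion mirrors the recurrence for Stirling numbers of the second kind, so the number of results for a 4-element list and k = 2 is 7.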
1 vote, 1 answer

Counting integer partitions for which the xor is zero

I'm looking for an efficient way to compute the number of partitions of an integer for which the xor is zero: F(n,c) = #{ (x1,x2, ... ,xc) | x1 + x2 + ... + xc = n & x1 xor x2 xor ... xor xc = 0 } For small values of n and c, it's easy to run nested…
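
As a reference point for small inputs, the definition of F(n,c) can be checked by brute force. This is only a sketch of the naive enumeration, not an efficient method. It assumes ordered tuples of non-negative integers, which the excerpt does not state; the name `count_xor_zero` and the `min_part` parameter are hypothetical.

```python
from functools import reduce
from itertools import product
from operator import xor


def count_xor_zero(n, c, min_part=0):
    """Count ordered tuples (x1, ..., xc) with sum n and xor 0 by enumeration.

    Assumption: each xi is an integer with min_part <= xi <= n.
    The running time grows as (n + 1) ** c, so this only suits small n and c.
    """
    count = 0
    for xs in product(range(min_part, n + 1), repeat=c):
        if sum(xs) == n and reduce(xor, xs, 0) == 0:
            count += 1
    return count
```

Under these assumptions `count_xor_zero(6, 3)` counts the orderings of (0, 3, 3) and (1, 2, 3); passing `min_part=1` restricts the count to positive parts.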
1 vote, 1 answer

How can I create an index based on values from another column in SQL?

For example if this is my table - SeqNo Gap 20 Start 21 End 29 Start 30 End 42 Start 43 End 49 Start 50 Start 51 Start 52 Start 53 Start 54 Start 55 End 220 Start 221 Start 222 End I want the based on Start and end output…
1 vote, 1 answer

A quick way to return all the dates in a table from a database partitioned by DATE and SYMBOL

I have a table in a database partitioned by DATE and SYMBOL, and the DATE column is of the TIMESTAMP type. Is there any faster way to return all the dates in the table than the statements select distinct(date(datetime)) from t and select count(*)…
Eva Gao
1 vote, 1 answer

How to find out the type of partitioning of a table in Google BigQuery using the Python APIs

def partition(dataset1, dataset2): try: client.get_dataset(dataset2) print("Dataset {} already exists".format(dataset2)) except NotFound: print("Dataset {} not found".format(dataset2)) …
1 vote, 1 answer

Shard a collection in MongoDB Atlas

Is it possible to shard a collection in MongoDB Atlas? I tried to shard a collection, but when I went to enable sharding on my database it gave this error: MongoServerError: (Unauthorized) not authorized on admin to execute command { enableSharding:…
1 vote, 1 answer

AWS Athena: Partition projection using date-hour with mixed ranges

I am trying to create an Athena table using partition projection. I am delivering records to S3 using Kinesis Firehose, grouped using a dynamic partitioning key. For example, the records look like the…
1 vote, 1 answer

PSQL determine the min value of date depending on another column

The input table looks like this: ID pid method date 111 A123 credit_card 12-03-2015 111 A128 ACH 11-28-2015 Now for the ID = 111, I need to select the MIN(date) and see what the method of payment for it is. I need the output table to…
moikoi
1 vote, 0 answers

Add generated column with aggregate over a partition and sort

I am trying to add a calculated column that computes a rolling average of a sorted partition. I can make it work as a query but cannot seem to get the result to become a calculated field. ALTER TABLE PUBLIC "minutes" ADD COLUMN "green_avg"…
Nuljon
1 vote, 1 answer

How to generate a single file per partition - Snowflake COPY into location

I've managed to unload my data into partitions, but each one of them is also being split into multiple files. Is there a way to force Snowflake to generate a single file per partition? It would also be great if I could zip all the files. This…
1 vote, 2 answers

Can I migrate a partitioned table to a non-partitioned table in Oracle with the CREATE TABLE statement?

I have an Oracle 11g partitioned table with 10 partitions for ten years of data, each on its own tablespace partitioned by range. Each year-partition contains 12 monthly-partitions. I would like to convert this table to a non-partitioned table,…
LBS
1 vote, 1 answer

Reading spark partitioned data from directories

My data is partitioned by year, month, and day in an S3 bucket. I have a requirement to read the last six months of data every day. I am using the code below to read the data, but it is selecting negative values for months. Is there a way to read the correct data for…
code_bug
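
Negative month values typically appear when a number of months is subtracted from the month number without borrowing from the year. The sketch below shows one way to compute the year and month pairs for a six-month window; it assumes the window includes the current month, and the name `last_n_months` is hypothetical. How the pairs map onto the bucket's directory layout is not shown in the excerpt, so that step is left out.

```python
from datetime import date


def last_n_months(today, n=6):
    """Return (year, month) pairs for the n months ending with today's month."""
    # Number months consecutively so that subtraction rolls over year boundaries.
    index = today.year * 12 + (today.month - 1)
    months = []
    for k in range(n):
        year, month_zero_based = divmod(index - k, 12)
        months.append((year, month_zero_based + 1))
    return months
```

For example, `last_n_months(date(2022, 2, 15))` starts at (2022, 2) and rolls back into 2021 instead of producing month 0 or below.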
1 vote, 3 answers

Partitioning by range columns unexpected behavior

I have a MySQL table partitioned by range columns (c_id and created_at), and I created 2 partitions: logs_1_2020 (c_id less than 2 and created less than 2021-01-01 00:00:00) logs_1_2021 (c_id less than 2 and created less than 2022-01-01…
es code
1 vote, 1 answer

spark-cassandra-connector - repartitionByCassandraReplica returns empty RDD - Java

So, I have a 16-node cluster where every node has Spark and Cassandra installed, and I am using the Spark-Cassandra Connector 3.0.0. I am trying to join a dataset with a Cassandra table on the partition key, while also trying to use…