Questions tagged [data-partitioning]

Data partitioning is the division of a collection of data into smaller collections for faster processing, easier statistics gathering, and a smaller memory or persistence footprint.

337 questions
0
votes
1 answer

Generating multiple equally sized output files in Hadoop

What are some methods for finding X data ranges in Hadoop so that one can use these ranges as partitions in the reducer step?
syker
  • 10,912
  • 16
  • 56
  • 68
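
One common way to approach the range-finding question above is to sample the keys, sort the sample, and read boundary keys off at evenly spaced ranks. The following is a minimal Python sketch of that idea, not Hadoop code: `split_points` and `partition_for` are hypothetical names, and the uniform random keys are made-up test data. Hadoop's own `InputSampler` and `TotalOrderPartitioner` classes follow a similar sample-then-split approach.

```python
import bisect
import random


def split_points(sample, num_partitions):
    """Return num_partitions - 1 boundary keys taken at evenly spaced ranks of the sorted sample."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def partition_for(key, boundaries):
    """Index of the range a key falls into; a key equal to a boundary goes to the right-hand range."""
    return bisect.bisect_right(boundaries, key)


# Illustrative data only: 100,000 synthetic integer keys and a 1,000-key sample.
random.seed(0)
keys = [random.randint(0, 10_000) for _ in range(100_000)]
sample = random.sample(keys, 1_000)

boundaries = split_points(sample, 4)
counts = [0] * 4
for k in keys:
    counts[partition_for(k, boundaries)] += 1

print(boundaries, counts)
```

With a representative sample the four counts should come out roughly equal; heavily skewed or duplicated keys would need a larger sample or a different split rule.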
0
votes
1 answer

Use conditional criteria to change Row_Number like operation

Using SQL Server 2012. I am using variables to identify the number of times "various criteria" are met within my overall dataset. I want to take half of these instances and do one thing, "first_half_thing", and with the other half do a…
0
votes
1 answer

Global indexes while renaming the partition name

I have an existing table with some indexes on it. I am going to partition that table using DBMS_REDEFINITION. I also have to rename the partitions every 24 hours. Will the global indexes have any problems after I rename the partition…
user1947949
  • 43
  • 2
  • 8
0
votes
1 answer

Parallel bulk loading using partition switching of indexed table in SQL Server 2008

This is a follow-up to a previous question of mine, after definitely deciding on partition switching as the best way to quickly get data into a heavily indexed fact-type table that needs to remain available to readers. While it seems to be the best…
0
votes
1 answer

trying to use partitioning instead of subqueries

I have two tables: one in my current system that holds the current email, and one that I created by exporting all the email addresses from the Global Address Book, along with a BIT value marking which is the primary email address and a user id value.…
user802599
  • 787
  • 2
  • 12
  • 35
0
votes
2 answers

Having trouble grokking multiple exclusive combinations in C++

I have an ordered string that I need to present to a user: ABCCDDCBBBCBBDDBCAAA Objects represented by 'B' are tagged, such that 2 Bs will have a '~' after them. AB~CCDDCB~BBCBBDDBCAAA AB~CCDDCBB~BCBBDDBCAAA AB~CCDDCBBB~CBBDDBCAAA and so…
notwithoutend
  • 235
  • 1
  • 9
0
votes
1 answer

efficient way to find unequal partitions of an integer

I have all the partitions of an integer and I want only those partitions whose values are all distinct. For example, the partitions of 4 are {1,1,1,1}, {2,2}, {3,1}, {1,1,2} and {4}. So the required unequal partitions are {3,1} and {4} because they contain…
Sushant
  • 66
  • 1
  • 9
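
Rather than generating every partition and filtering, one option is to generate only the partitions with distinct parts by forcing each part to be strictly smaller than the one before it. The sketch below assumes that reading of the question; `distinct_partitions` is a hypothetical name. For 4 it yields [4] and [3, 1], the two answers the excerpt names.

```python
def distinct_partitions(n, max_part=None):
    """Yield the partitions of n into strictly decreasing positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        # Nothing left to split: the empty list completes one partition.
        yield []
        return
    # Choose the largest part first, then split the remainder using only smaller parts.
    for first in range(min(n, max_part), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield [first] + rest


print(list(distinct_partitions(4)))
print(list(distinct_partitions(6)))
```

Because no partition with a repeated part is ever built, the work grows with the number of distinct partitions instead of the much larger number of all partitions.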
-1
votes
1 answer

Python semisort list of objects by attribute

I've got a list of objects: class packet(): def __init__(self, id, data): self.id, self.data = id, data my_list = [packet(1,"blah"),packet(2,"blah"),packet(1,"blah"),packet(3,"blah"),packet(4,"blah")] I want to extract all objects…
Jakob Lovern
  • 1,301
  • 7
  • 24
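
The excerpt is cut off, so the exact goal is not known. Assuming the aim is to bring objects with the same `id` together without a full sort, a dictionary of lists is one way to semisort. In this sketch `group_by_id` is a hypothetical helper and the class is renamed `Packet` for convention only.

```python
from collections import defaultdict


class Packet:
    def __init__(self, id, data):
        self.id, self.data = id, data


def group_by_id(packets):
    """Bucket packets by their id, keeping first-seen order of ids (Python 3.7+ dicts)."""
    groups = defaultdict(list)
    for p in packets:
        groups[p.id].append(p)
    return dict(groups)


my_list = [Packet(1, "blah"), Packet(2, "blah"), Packet(1, "blah"),
           Packet(3, "blah"), Packet(4, "blah")]

groups = group_by_id(my_list)

# Flatten the buckets so that equal ids sit next to each other.
semisorted = [p for bucket in groups.values() for p in bucket]
print([p.id for p in semisorted])
```

This takes one pass over the list and needs only hashable ids, not an ordering on them.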
-1
votes
1 answer

How to group by user_id and time using SQL in BigQuery

I have a table that contains user_id, time (at six-hour intervals), and average margin. I want to group by user_id and time (time in ascending order). The table looks like this: user_id time average_margin 5696 2020-10-12…
-1
votes
1 answer

Formatting an eMMC to SD format

I've been working with a Micron BGA eMMC chip and prototyping a communication scheme with the eMMC chip inside an adapter board that connects to the GPIO pins of a TI microcontroller. I've essentially created a communication scheme written in C code…
-1
votes
1 answer

How to insert data from a temporary table into a partitioned table in Oracle SQL using a MERGE statement

I have to write a MERGE statement to insert data from a temporary table into a partitioned table, and I'm getting the error below: Error report - SQL Error: ORA-14400: inserted partition key does not map to any partition I have to do it session wise and as…
-1
votes
1 answer

jq: groupby and nested json arrays

Let's say I have: [[1,2], [3,9], [4,2], [], []] I would like to know the scripts to get: the number of nested lists which are and are not non-empty, i.e. I want to get [3,2]; the number of nested lists which do and do not contain the number 3, i.e. I want to…
Maths noob
  • 1,684
  • 20
  • 42
-1
votes
2 answers

Unable to create exactly equal data partitions using createDataPartition in R: getting 1396 and 1398 observations but need 1397 each

I am quite familiar with R but have never had this requirement before, where I need to create exactly equal random data partitions using createDataPartition in R. index = createDataPartition(final_ts$SAR,p=0.5, list = F) final_test_data =…
Bharat Ram Ammu
  • 174
  • 2
  • 16
-1
votes
1 answer

PySpark: Why is show() or count() on a joined Spark DataFrame so slow?

I have two large Spark DataFrames. I joined them on one common column: df_joined = df1.join(df2.select("id",'label'), "id") I got the result, but when I want to work with df_joined, it's too slow. As far as I know, we need to repartition df1 and df2 to…
-1
votes
1 answer

STDEVP for calculated fields

I have a table that looks like this: ID CHANNEL VENDOR num_PERIOD SALES.A SALES.B 000001 Business Shop 1 40 30 000001 Business Shop 2 60 20 000001 Business Shop 3 NULL …
Also
  • 101
  • 1
  • 2
  • 6