Questions tagged [partition]

Use this tag for questions about code that partitions data, memory, virtual machines, databases or disks.

In computing, partition may refer to

  • Disk partitioning, the division of a hard disk drive
  • Partition (database), the division of a database
  • Logical partition (virtual computing platform) (LPAR), a subset of a computer's resources, virtualized as a separate computer
  • Memory partition, a subdivision of a computer's memory, usually for use by a single job
  • Binary space partitioning

source: https://en.wikipedia.org/wiki/Partition

Note that non-programming questions about database partitioning are likely to be better received on Database Administrators and disk partitioning on Server Fault.

1547 questions
12
votes
4 answers

how to make an image of android partition to your pc

I am trying to make a backup (a direct dd image) of the partitions of my phone's built-in memory card to my PC. I am using Linux and my phone is a Nexus 4.
hongo
  • 687
  • 1
  • 6
  • 10
11
votes
1 answer

spark write to disk with N files less than N partitions

Can we write data to, say, 100 files, with 10 partitions in each file? I know we can use repartition or coalesce to reduce the number of partitions. But I have seen some Hadoop-generated Avro data with many more partitions than files.
Kenny
  • 355
  • 1
  • 5
  • 14
11
votes
2 answers

Hive doesn't read partitioned parquet files generated by Spark

I'm having a problem reading partitioned Parquet files generated by Spark in Hive. I'm able to create the external table in Hive, but when I try to select a few lines, Hive returns only an "OK" message with no rows. I'm able to read the partitioned…
ALunz
  • 311
  • 2
  • 8
10
votes
5 answers

Move docker volume to different partition

I have a server where I run some containers with volumes. All my volumes are in /var/lib/docker/volumes/ because Docker manages them. I use docker-compose to start my containers. Recently, I tried to stop one of my containers but it was impossible…
fmdaboville
  • 1,394
  • 3
  • 15
  • 28
10
votes
2 answers

Difference between partition and index in hive

I am new to Hadoop and Hive and would like to know the difference between an index and a partition in Hive. When should I use an index and when a partition? Thank you!
sonia
  • 167
  • 2
  • 2
  • 10
10
votes
6 answers

Data not getting loaded into Partitioned Table in Hive

I am trying to create a partition for my table in order to update a value. This is my sample data: 1,Anne,Admin,50000,A 2,Gokul,Admin,50000,B 3,Janet,Sales,60000,A. I want to update Janet's Department to B. To do that, I created a table with…
USB
  • 6,019
  • 15
  • 62
  • 93
9
votes
2 answers

How to check the number of partitions of a Spark DataFrame without incurring the cost of .rdd

There are a number of questions about how to obtain the number of partitions of an RDD and/or a DataFrame; the answers invariably are rdd.getNumPartitions or df.rdd.getNumPartitions. Unfortunately that is an expensive operation on a DataFrame…
WestCoastProjects
  • 58,982
  • 91
  • 316
  • 560
9
votes
4 answers

How to pick up all data into hive from subdirectories

I have data organized in directories in a particular format (shown below) and want to add it to a Hive table. I want to add all the data in the 2012 directory. All the names below are directory names, and the innermost directory (3rd level) has the actual data…
Yash Sharma
  • 1,674
  • 2
  • 16
  • 23
9
votes
3 answers

SQL Concatenate multiple rows

I'm using Teradata, I have a table like this ID String 123 Jim 123 John 123 Jane 321 Jill 321 Janine 321 Johan I want to query the table so I get ID String 123 Jim, John, Jane 321 Jill, Janine,…
user2888246
  • 175
  • 1
  • 4
  • 9
8
votes
1 answer

What is partitioner parameter in Tensorflow variable_scope used for?

tf.variable_scope has a partitioner parameter, as mentioned in the documentation. As I understand it, it's used for distributed training. Can anyone explain in more detail how to use it correctly?
Anas Bari
  • 171
  • 1
  • 2
  • 8
8
votes
3 answers

How kafka balances partitions load?

I have a question about load balancing in Kafka. I created a topic with 10 partitions and 2 consumers. The 10 partitions were divided and assigned to these consumers (5 partitions to the first and 5 to the second) and it works fine.…
8
votes
4 answers

Python partition string with regular expressions

I am trying to clean text strings using Python's partition and regular expressions. For example: testString = 'Tre Bröders Väg 6 2tr' sep = '[0-9]tr' head,sep,tail = testString.partition(sep) head >>>'Tre Br\xc3\xb6ders V\xc3\xa4g 6 2tr' The head…
seb
  • 2,251
  • 9
  • 30
  • 44
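
A note on the excerpt above: Python's built-in str.partition treats its separator as a literal string, not as a pattern, which is why the head comes back unsplit. A minimal sketch of a regex-aware equivalent follows, assuming the goal is to split at the first match of the pattern. The helper name regex_partition is hypothetical and not part of the standard library; only re.search and the match object methods are real API.

```python
import re


def regex_partition(text, pattern):
    """Mimic str.partition, but treat the separator as a regular expression.

    Hypothetical helper for illustration; returns (head, sep, tail).
    """
    match = re.search(pattern, text)
    if match is None:
        # Same convention as str.partition when the separator is absent:
        # the whole string, followed by two empty strings.
        return text, "", ""
    # Split around the first match only.
    return text[:match.start()], match.group(0), text[match.end():]


head, sep, tail = regex_partition("Tre Bröders Väg 6 2tr", r"[0-9]tr")
print(repr(head), repr(sep), repr(tail))
```

With the sample string from the question, the head keeps the trailing space before the matched separator, so a .rstrip() may be wanted afterwards.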
8
votes
2 answers

Creating a partitioned hive table from a non partitioned table

I have a Hive table which was created by joining data from multiple tables. The data for this resides in a folder which has multiple files ("0001_1" , "0001_2", ... and so on). I need to create a partitioned table based on a date field in this table…
veemo
  • 311
  • 1
  • 2
  • 10
8
votes
1 answer

How to filter on ROW_NUMBER()

I am trying to select distinct NAME from a dataset but also return other columns. I have it working to a degree but just can't figure out how to bring it together. I suspect I need a WITH x( or something, but am unsure. Here is the code and an image…
Orin Moyer
  • 509
  • 2
  • 7
  • 13
8
votes
2 answers

How to sample/partition panel data by individuals (preferably with caret library)?

I would like to partition panel data and preserve the panel nature of the data: library(caret) library(mlbench) #example panel data where id is the persons identifier over years data <-…
Googme
  • 914
  • 7
  • 27