I have a 200,000 file to be split into 8 chunks using powershell
The file has rows with the first value being the record 'KEY'
I would like to ensure that rows corresponding to the key field value (which is the first value of the row) do not break across files when the split happens.
Here is the simple split I use
$i=0
Get-Content -Encoding Default "C:\Test.csv" -ReadCount 10130 | ForEach-Object {
$i++
$_ | Out-File -Encoding Default "C:\Test_$i.csv"
}
Sample Data
0190709,HP16,B,B,3,3,
0190709,HP17,B,B,3,3,
0190709,HP18,B,B,3,3,
0196597,HP11,,CNN,,,
0196597,HP119,,CNN,,,
0196597,HP13,,CNN,,,
01919769,HP11,,ANN,,,
01919769,HP119,,OPN,,,
01919769,HP13,,CNN,,,
01919769,HP14,X,X,X,X,
01919769,HP15,A,A,X,X,
01919769,HP16,S,S,X,X,
01919769,HP17,S,S,5,5,
01919769,HP18,S,S,5,5,
0797819,HP14,X,AX,X,X,
0797819,HP15,X,XA,X,X,
0797819,HP16,X,X,XA,XA,
0797819,HP17,A,A,X,X,
0797819,HP18,A,A,AX,X,
Expected Output
Lets say we want 2 chunks of equal size. I would like 2 files like below with the key not split between files. Its ok if the file gets bigger (more lines) in an attempt to prevent page break of the key.
File 1
0190709,HP16,B,B,3,3,
0190709,HP17,B,B,3,3,
0190709,HP18,B,B,3,3,
0196597,HP11,,CaweNN,,,
0196597,HP119,,CNN,,,
0196597,HP13,,CNwN,,,
01919769,HP11,,AawNN,,,
01919769,HP119,,OePN,,,
01919769,HP13,,CNN,,,
01919769,HP14,XY,X,X,X,
01919769,HP15,A,A,XC,XA,
01919769,HP16,S,S,X,X,
01919769,HP17,S,S,5A,5,
01919769,HP18,S,S,5,5,
File 2
0797819,HP14,X,AX,X,X,
0797819,HP15,X,XA,X,X,
0797819,HP16,X,X,XA,XA,
0797819,HP17,A,A,X,X,
0797819,HP18,A,A,AX,X,