2

i'm new at powershell coding and what i'm trying to do is to import a CSV files with an unknown number of lines and then split it every 10 lines to a new CSV file.

Example if I input a CSV file of 97 lines, I expect 9 files of 10 lines and 1 file of 7 lines.

i tried the following, the Hostlist.csv contains 35 hostname, i'm getting 3 files, the first file missing the first host in the list and there is no 4th file with remaining host.

$Hostlist = Get-Content '.\Hostlist.csv'
$BatchID = Get-Random
New-Item -ItemType Directory -Force -Path .\$BatchID
$MaxBatch = 10
$line = 0
$i = 0
$File = 0

While ($line -lt $Hostlist.Length) {

    if ($i -gt $MaxBatch) {
        $Start = $line - $MaxBatch
        $File++
        $Hostlist[$Start..($line-1)] | Out-File ".\$BatchID\Batch$File.csv" -Append -force
        $i = 0
    }
    $i++;
    $line++
}
Outableh
  • 35
  • 3

2 Answers2

2

As an aside:

  • If your input file is a CSV file and you want your output files to be CSV files too, you'd have to write a header line to each.

  • Your code makes no attempt to do so, so I'm assuming you're simply dealing with line-oriented plain-text files, despite the .csv filename extension.

the first file missing the first host

Since your condition for processing a batch is $i -gt $MaxBatch, $i and $line are both 11 when you enter the if block first, and $Start = $line - $MaxBatch is therefore 1, i.e. the second line, given that $Start is used as a 0-based array index.

there is no 4th file with remaining host.

Since you're only processing a batch if $i -gt $MaxBatch, an input file whose line count that isn't evenly divisible by $MaxBatch will always be missing its last batch, because the last, incomplete batch then never satisfies the if condition.


I suggest simplifying your code by calculating the number of batches and looping batch by batch, as shown in the following simplified example, which partitions a 10-line input into batches of 3:

# Simulate the list of hosts
$HostList = 1..10 -replace '^', 'host$&' # 'host1', 'host2', ...

# Batch size
$MaxBatch = 3

foreach ($batch in 1..[math]::Ceiling($Hostlist.Count / $MaxBatch)) {
  write-verbose -Verbose "File index (batch number): $batch"
  $startNdx = ($batch-1) * $MaxBatch
  $Hostlist[$startNdx..($startNdx + $MaxBatch - 1)]
}

Note how use of [math]::Ceiling() on the batch-count calculation $Hostlist.Count / $MaxBatch ensures that an incomplete batch at the end is also processed.

Unless you have Set-StrictMode -Version 3 or higher in effect, it is fine to exceed the upper array bound in the .. range expression used to slice the array in a final, incomplete batch - PowerShell will simply ignore indices beyond the upper bound.

The above yields:

VERBOSE: File index (batch number): 1
host1
host2
host3
VERBOSE: File index (batch number): 2
host4
host5
host6
VERBOSE: File index (batch number): 3
host7
host8
host9
VERBOSE: File index (batch number): 4
host10
mklement0
  • 382,024
  • 64
  • 607
  • 775
0

your code won't make a proper CSV file since you are not handling the header info, nor using Import-CSV to let you deal with objects that naturally export as a CSV. so i went with some code written for another person.

$SourceDir = $env:TEMP
$SourceFile = 'Source.csv'
$FullSourceFile = Join-Path -Path $SourceDir -ChildPath $SourceFile

$BatchSize = 4

$OutputDir = $env:TEMP
$OutputFile = 'Output.csv'
$FullOutputFile = Join-Path -Path $OutputDir -ChildPath $OutputFile

#$InCSV = Import-Csv -Path $FullSourceFile

#region - fake reading in CSV
# fake reading in a CSV file
#    in real life, use Import-CSV above
$InCSV = @'
Col_1,Col_2,Col_3,Col_4
row-1-1, row-1-2, row-1-3, row-1-4
row-2-1, row-2-2, row-2-3, row-2-4
row-3-1, row-3-2, row-3-3, row-3-4
row-4-1, row-4-2, row-4-3, row-4-4
row-5-1, row-5-2, row-5-3, row-5-4
row-6-1, row-6-2, row-6-3, row-6-4
row-7-1, row-7-2, row-7-3, row-7-4
row-8-1, row-8-2, row-8-3, row-8-4
row-9-1, row-9-2, row-9-3, row-9-4
row-10-1, row-10-2, row-10-3, row-10-4
'@ | ConvertFrom-Csv
#endregion - fake reading in CSV

$Count = 0
$Remaining = $InCSV.Count
$Batch = [System.Collections.Generic.List[PSObject]]::new()
foreach ($IC_Item in $InCSV)
    {
    $Count ++
    $Ready = $False

    $Batch.Add($IC_Item)

    if ($Count -eq $BatchSize)
        {
        $Ready = $True
        $Count = 0
        }

    $Remaining --
    if ($Remaining -eq 0)
        {
        $Ready = $True
        }

    if ($Ready)
        {
        $TimeStamp = Get-Date -Format 'yyyy-MM-dd__HH-mm-ss__fff'
        $NewFullOutputFile = $FullOutputFile.Replace('.csv', "$TimeStamp.csv")

        $Batch | 
            Export-Csv -LiteralPath $NewFullOutputFile -NoTypeInformation

        $Batch.Clear()
        }
    }

there is no screen output. here is the content of the 1st CSV ...

"Col_1","Col_2","Col_3","Col_4"
"row-1-1","row-1-2","row-1-3","row-1-4"
"row-2-1","row-2-2","row-2-3","row-2-4"
"row-3-1","row-3-2","row-3-3","row-3-4"
"row-4-1","row-4-2","row-4-3","row-4-4"

the content of the last CSV ...

"Col_1","Col_2","Col_3","Col_4"
"row-9-1","row-9-2","row-9-3","row-9-4"
"row-10-1","row-10-2","row-10-3","row-10-4"
Lee_Dailey
  • 7,292
  • 2
  • 22
  • 26
  • 1
    @Outableh - you are most welcome! glad to have helped a tad ... and the answer by `mklement0` is truly the better one. [*grin*] – Lee_Dailey Oct 14 '19 at 01:03