-1

I'm working with a bunch of Excel spreadsheets that have anywhere from 3 to 14 columns:

  • Some really only have 3 columns
  • Some have a number of hidden columns (e.g.: 5 are hidden 9 are visible)
  • The rest are complete (all the data, no hidden columns etc.) with up to 14 columns

Fortunately, in all cases, I only need a subset of that data and they all contain what I need. My initial thought was to read the Excel document and return the contents as an object but I'm having a difficult time conceptualizing a process that will work for Excel documents with varying columns.

The code below shows both examples while in the real world I would only use one.

Function Get-ExcelContent
    {
        [CmdletBinding()]
        Param
            (
            [Parameter(Mandatory=$false)]
            [string]$ExcelFile,
            [Parameter(Mandatory=$false)]
            [string]$SheetName,
            [alias('Visible')]
            [switch]$VisibleFieldsOnly
        )


        If(!(Test-Path -Path $ExcelFile -PathType Leaf))
            {
                write-host "Unable to find excel file: $ExcelFile"
                break
            }

        # for 64-bit os'
        $strProvider = "Provider=Microsoft.ACE.OLEDB.12.0"
        # otherwise 32-bit os
        #$strProvider = "Provider=Microsoft.Jet.OLEDB.4.0"

        $strDataSource = "Data Source = $ExcelFile"
        $strExtend = "Extended Properties=Excel 8.0"
        $strQuery = "Select * from [$SheetName]"
        $objConn = New-Object System.Data.OleDb.OleDbConnection("$strProvider;$strDataSource;$strExtend")
        $sqlCommand = New-Object System.Data.OleDb.OleDbCommand($strQuery)
        $sqlCommand.Connection = $objConn
        $objConn.open()
        $DataReader = $sqlCommand.ExecuteReader()

        If($VisibleFieldsOnly)
            {
                # Get count of the non-hidden fields
                $CountOfColumns = $DataReader.VisibleFieldCount
            }
        Else
            {
                # Get all the fields
                $CountOfColumns = $DataReader.FieldCount
            }

        $ColumnCounter = 0
        $pscoExcelData = @()

        While($DataReader.read())
            {
                ###########################################################
                # IF I KNOW THE COUNT OF COLUMNS AHEAD OF TIME, THIS WORKS
                ###########################################################
                $pscoExcelData += [pscustomobject][ordered] @{
                    $DataReader.GetName(0) = $DataReader[0].Tostring()
                    $DataReader.GetName(1) = $DataReader[1].Tostring()
                    $DataReader.GetName(2) = $DataReader[2].Tostring()
                    $DataReader.GetName(3) = $DataReader[3].Tostring()
                    $DataReader.GetName(4) = $DataReader[4].Tostring()
                    $DataReader.GetName(5) = $DataReader[5].Tostring()
                    $DataReader.GetName(6) = $DataReader[6].Tostring()
                    $DataReader.GetName(7) = $DataReader[7].Tostring()
                    $DataReader.GetName(8) = $DataReader[8].Tostring()
                }
                ###########################################################
                # BUT HOW DO I DO IT WHEN THE COLUMN COUNT VARIES?
                ###########################################################
                for($i=0; $i -le $CountOfColumns-1;$i++)
                {
                      $pscoExcelData += [pscustomobject][ordered] @{ $DataReader.GetName($i) = $DataReader[$i].Tostring() }
                }
            }

        $dataReader.close()
        $objConn.close()

        $pscoExcelData
    }

Get-ExcelData 'C:\path\to\Book1.xlsx' 'Sheet1$' -VisibleFieldsOnly
JuliusPIV
  • 145
  • 1
  • 3
  • 11

1 Answers1

0

You just have to create the hash table in the loop, and then convert that to a PowerShell object at the end. Something like the following (not tested):

$rec = @{}
for($i=0; $i -lt $CountOfColumns; $i++)
{
    $rec[$DataReader.GetName($i)] = $DataReader[$i].Tostring()
}
$pscoExcelData += [pscustomobject][ordered]$rec

Edit: Another possibility that should preserve the column order (also not tested):

$rec = New-Object pscustomobject
for($i=0; $i -lt $CountOfColumns; $i++)
{
    $rec | Add-Member -Name $DataReader.GetName($i) -Value $DataReader[$i].Tostring()
}
$pscoExcelData += $rec

(By the way, it is conventional to use -lt rather than -le in for loops.)

dan-gph
  • 16,301
  • 12
  • 61
  • 79
  • Interesting and I think you nailed it. The only change was not being able to use the [ordered] attribute on the variable otherwise it barks at you: `The ordered attribute can be specified only on a hash literal node` This is where I ended up: `$hashRecord = [ordered]@{} for($i=0; $i -le $CountOfColumns-1;$i++) { $hashRecord[$DataReader.GetName($i)] = $DataReader[$i].Tostring() } $pscoExcelData += [pscustomobject]$hashRecord` – JuliusPIV Sep 22 '15 at 01:33
  • @Phylum, in that case, you might want to do it a different way if you need to preserve the column order. Please see the edit above. – dan-gph Sep 22 '15 at 01:56