0

I am looking to sanity check a flat csv file from Workday prior to import into MIM. The file is dumped in a path on the server every 90 minutes. Before importing the file into MIM I would like to compare the most recent download to the last import. If a line is missing on the DifferenceObject that appeared on the ReferenceObject I want to create a custom object and add each to an array. Currently I find I am capturing any changes made (duh i'm doing a diff) but what I am looking to grab is just when a line is missing completely. When a line is missing completely I am taking the InputObject and using the .split method at each position of my comma delimiter. The key here is I want to observe the account name and status. If a user was Active on the ReferenceObject and does not exist on the DifferenceObject I will then create a synthetic record of the user in a Terminated state to append to the DifferenceObject prior to MIM importing the file. On the following round the object would be missing again but since it was in a terminated state a synthetic record will not be added. This will protect deletion of Active object if a user is filtered out of the Workday file, as well rescinded hires who are deleted in an Active state will now flow through the system as terminated prior to the objects deletion in effort to accomodate Service Now status.

$diff = Compare-Object -ReferenceObject (Get-Content -Path results.20220421_1532.csv) -DifferenceObject (Get-Content -Path results.20220421_1705.csv)
$datarr = @()
$diff | Where-Object { $_.sideIndicator -eq "<=" } |  ForEach-Object {
    $obj = [PSCustomObject]@{
        SamAccountName = $_.InputObject.Split(",")[0]
        AccountStatus1 = $_.InputObject.Split(",")[8]
        AccountStatus2  = $_.InputObject.Split(",")[9]
    }
    if ($obj.AccountStatus2 -like "Active") {
        $datarr += $obj    
    }else {}
}

Currently you can see I am adding users to $datarr when there is a change and they are active in the referenceObject. I guess essentially I am looking to only add to $datarr when in referenceObject they are Active and in DifferenceObject there line is $null but there are no $null lines and the line count length is less when an object is filtered out of the Workday report.

mklement0
  • 382,024
  • 64
  • 607
  • 775
LEnc95
  • 58
  • 2
  • 10
  • 3
    Are you able to simplify your "question"? For me personally, it's overly complicated to read and understand what it is you're trying to ask (I don't actually see a question in your post). – codaamok Apr 22 '22 at 15:50
  • Yep, I want to observe two files and where a line exists in one file but does not on the other I want to add the missing line to an array. I don't want to capture all differences between files but only the missing lines. – LEnc95 Apr 22 '22 at 15:58
  • @codaamok pretty clear from description from the title... – LEnc95 Apr 22 '22 at 16:04
  • 2
    It might make sense to you because your head is in "the zone" on the issue, but from an outside perspective, I can confidently tell you it is *not* clear. You should aim for short and concise questions, this will give you better chances of receiving accurate and quality answers. For instance, we don't need a written walk-through of your code - we can read your code. – codaamok Apr 22 '22 at 16:11
  • @codaamok Gotcha will keep it short moving forward. So how do you diff two files and target missing lines. – LEnc95 Apr 22 '22 at 16:41
  • So in short, you want an array containing only the missing lines from `results.20220421_1705.csv`? – codaamok Apr 22 '22 at 16:46

1 Answers1

3
  • You need to correlate the two CSV files by their identifying column values, which I presume to be in the SamAccountName column.

    • To that end, it's better (albeit slower) to use Import-Csv to parse the CSVs into objects instead of trying plain-text processing via Get-Content
  • You can then correlate the objects via their SamAccountName property values, which Compare-Object allows you to do by passing the property name to its -Property parameter.

    • Adding the -PassThru switch ensures that the full input objects (always from the -ReferenceObject collection) are being passed through (decorated with a SideIndicator property)

Therefore, I presume you're looking for something like the following:

$datarr = 
  Compare-Object -Property SamAccountName -PassThru `
    -ReferenceObject  (Import-Csv results.20220421_1532.csv) `
    -DifferenceObject (Import-Csv results.20220421_1705.csv) | 
      Where-Object SideIndicator -EQ '<=' | 
        Where-Object AccountStatus2 -eq Active |
          Select-Object SamAccountName, AccountStatus1, AccountStatus2

Note that I'm assuming that:

  • both CSVs have a SamAccountName column or at least share a column containing SAM account names, even if its name differs - adjust accordingly.

  • (at least) the -ReferenceObject CSV has AccountStatus1 and AccountStatus2 columns; if you need to rename those properties, you can use calculated properties.

If your CSV files lack headers, supply the desired column names as an array to Import-Csv's
-Header parameter; e.g.:

$colNames = 'SamAccountName', 'foo', 'bar', 'AccountStatus1', 'AccountStatus2'
Import-Csv -Header $colNames results.20220421_1532.csv

Note that you may pass fewer names than there are columns in the CSV, if the remaining columns aren't of interest, but names must be specified for every column up to the last one of interest.

mklement0
  • 382,024
  • 64
  • 607
  • 775