
I am trying to pull substrings out of text files in bulk and save them to an array. I have tried variations of the following. It prints all of the selected strings to the screen but saves only the final output to the variable. Is there a way to mimic the functionality of a += operator in -OutVariable so that all items get stored in an array?

$FILES = ls "*.txt"
foreach($f in $FILES){
  $in=Get-Content $f
  $in | Foreach { Select-String -Path "$f" -Pattern "Ad ID" -outvariable array1 }}

In case my strategy is misguided: the overall purpose of pulling substrings into an array is to have several arrays, each holding one kind of substring from these text files. I will then concatenate the values into a CSV. I'm attempting to pull elements out rather than rearrange the text files, because the substrings appear in a different order in each file. Example:

Txt File One:

Ad Id: xxxx
Ad Text: blah blah
Ad placement: spaceship

Txt File Two:

Ad Id: yyyy
Ad placement: zoo
Ad Text: blah blah

Final desired result (this part is working except for the order of the elements)

CSV file

xxxx, spaceship, blah blah
yyyy, zoo, blah blah
Merrill Cook

3 Answers

Here is a way to build the array you are talking about. I do not think this is the best way to solve this problem. This does nothing about the order of results, nor does it create a .csv file.

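$FILES = Get-ChildItem -File -Filter "*.txt"

# @() creates an empty array; $() would evaluate to $null.
$array1 = @()

foreach($f in $FILES) {
    # FullName keeps the path valid even when the current location differs.
    Get-Content -Path $f.FullName |
        Select-String -Pattern "Ad Id.*" |
        ForEach-Object { $array1 += @($_.Matches.Value) }
}

# Number of files processed
$FILES.Count

# Number of matches collected, followed by the matches themselves
$array1.Count
$array1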
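
On the original question of appending with -OutVariable: PowerShell's common parameters accept a plus sign before the variable name (-OutVariable +array1), which adds to the variable instead of replacing it. A minimal sketch, assuming two sample files written to a hypothetical temporary folder ($dir and the file names are illustrative, not from the question):

```powershell
# Hypothetical sample data: two files in a temporary folder, mirroring the question.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'one.txt') -Value 'Ad Id: xxxx', 'Ad Text: blah blah', 'Ad placement: spaceship'
Set-Content -Path (Join-Path $dir 'two.txt') -Value 'Ad Id: yyyy', 'Ad placement: zoo', 'Ad Text: blah blah'

$files  = Get-ChildItem -Path $dir -Filter '*.txt' -File
$array1 = @()
foreach ($f in $files) {
    # The leading "+" appends to $array1 instead of overwriting it on each call.
    Select-String -Path $f.FullName -Pattern 'Ad Id' -OutVariable +array1 | Out-Null
}

# Each stored item is a MatchInfo object; its Line property holds the matching line.
$idLines = $array1 | ForEach-Object { $_.Line }
$idLines

# Remove the sample files again.
Remove-Item -Path $dir -Recurse
```

The variable name is passed without the $ sign, and the collected items are MatchInfo objects rather than plain strings, so a property such as Line has to be read to get the text.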
lit
  • Awesome -- this is exactly what I was trying to do. Totally right that order of results needs to be tackled as well. But this worked as expected for this step. Thank you! – Merrill Cook May 19 '18 at 20:39
  • Also, regarding the order of the results: this kept the order of the txt files in the directory. Substituting other elements for "Ad Id.*" also kept the same order. So I'm assuming this will work for other fields and allow the substrings to be paired back up in the csv. – Merrill Cook May 19 '18 at 20:55

Try this one:

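$files = Get-ChildItem -File -Filter "*.txt"

foreach ($f in $files) {
    # Start with an empty table for each file so values never carry over.
    $dictionary = @{}
    # Get-Content already returns one string per line; skip lines without a colon.
    Get-Content -Path $f.FullName | Where-Object { $_ -match ':' } | ForEach-Object {
        # Split on the first colon only, then trim the surrounding whitespace.
        $key, $value = $_ -split ':', 2
        $dictionary[$key.Trim()] = $value.Trim()
    }
    $dictionary['Ad Id'] + ', ' + $dictionary['Ad placement'] + ', ' + $dictionary['Ad Text'] | Out-File -FilePath '.\results.csv' -Append
}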

Sorted output:

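$files = Get-ChildItem -File -Filter "fil*.txt"
[System.Collections.Generic.List[String]]$list = @()

foreach ($f in $files) {
    # Start with an empty table for each file so values never carry over.
    $dictionary = @{}
    # Get-Content already returns one string per line; skip lines without a colon.
    Get-Content -Path $f.FullName | Where-Object { $_ -match ':' } | ForEach-Object {
        # Split on the first colon only, then trim the surrounding whitespace.
        $key, $value = $_ -split ':', 2
        $dictionary[$key.Trim()] = $value.Trim()
    }
    [void]$list.Add( $dictionary['Ad Id'] + ', ' + $dictionary['Ad placement'] + ', ' + $dictionary['Ad Text'] )
}
# Sort the finished lines as strings (each starts with the Ad Id) before writing them.
[void]$list.Sort()
$list | Out-File -FilePath '.\results.csv' -Append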
f6a4
  • This is a very helpful response. I see why I would want to parse the txt this way. After pushing the txt files through the above, results.csv comes back as an empty csv (just commas). There are other fields split by colons that I did not mention above for simplicity. Do you know if I need to add them all as keys in $dictionary for this approach to work? And thank you for the thorough answer! – Merrill Cook May 19 '18 at 20:50
  • Put another way, if there are more colons triggering the split than there are available keys in the dictionary, could that be what is messing with the output? – Merrill Cook May 19 '18 at 20:52
  • I'm not sure I understand you correctly. Could you give a detailed example of what you mean (input/expected output)? – f6a4 May 19 '18 at 20:55
  • So the txt files in $files actually have a number of fields followed by colons, more than are added to the dictionary (or than were in my original example). Examples: Age:, Language:, Ad Creation Date:, etc. I'm trying to see if it works when I add all fields with colons to the line [void]$list.Add.. I think the existence of more colons than Ad Id, Ad placement, and Ad Text might be messing with the output? – Merrill Cook May 19 '18 at 20:59
  • Never mind, got it to work. The example fields I used above differed slightly from the actual field names. Thank you! – Merrill Cook May 19 '18 at 21:03
  • No problem. If you want to add other fields, e.g. 'Ad AnyThingElse: 12345', just add the respective dictionary index to the line '[void]$list.Add(....)' – f6a4 May 19 '18 at 21:09

Another slightly different approach.

  • A regex parses $Line and creates a variable named after the text before the colon (without the "Ad" prefix), holding the value that follows the colon
  • After each file is processed, the variables are output as a custom object

$Data = ForEach ($File in (Get-ChildItem File*.txt)){
    $Id,$Text,$Placement="","",""
    ForEach ($Line in (Get-Content $File)){
        If ($Line -Match "AD (?<Label>.*?): (?<Value>.*)"){
            Set-Variable -Name "$($Matches.Label)" -Value $Matches.Value
        }
    }
    [PSCustomObject]@{ID        = $Id
                      Placement = $placement
                      Text      = $Text}
}
$Data
$Data | Export-Csv ".\Result.csv" -NoTypeInformation

Sample output:

ID   Placement Text
--   --------- ----
xxxx spaceship blah blah
yyyy zoo       blah blah
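
For files that carry extra fields (Age, Language and so on, as mentioned in the comments on the previous answer), a variant of the same idea can collect every "Name: value" line and fix the column order only at the end. This is a sketch under stated assumptions, not part of any answer above: ConvertFrom-AdText is a hypothetical helper name, and the sample lines, including the Age value, are made up for illustration.

```powershell
# Hypothetical helper (not from the thread): turn "Name: value" lines into one object.
function ConvertFrom-AdText {
    param([string[]]$Lines)
    $fields = [ordered]@{}
    foreach ($line in $Lines) {
        # Split on the first colon only, so a value may itself contain colons.
        $name, $value = $line -split ':', 2
        # Lines without a colon leave $value empty and are skipped.
        if ($null -ne $value) { $fields[$name.Trim()] = $value.Trim() }
    }
    [PSCustomObject]$fields
}

# Illustrative input; in practice the lines would come from Get-Content.
$one = ConvertFrom-AdText -Lines 'Ad Id: xxxx', 'Ad Text: blah blah', 'Ad placement: spaceship'
$two = ConvertFrom-AdText -Lines 'Ad Id: yyyy', 'Ad placement: zoo', 'Age: 18 - 65', 'Ad Text: blah blah'

# Select-Object sets the column order, whatever order the fields had in each file.
$rows = $one, $two | Select-Object 'Ad Id', 'Ad placement', 'Ad Text'
$csv  = $rows | ConvertTo-Csv -NoTypeInformation
$csv
```

Swapping ConvertTo-Csv for Export-Csv writes the same rows to a file. Fields that are not listed in Select-Object, such as Age here, are dropped from the output.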