I have a text file with thousands of lines, containing both directory paths and file paths. I would like to loop through each line of the text file, remove every line that contains a directory path, and keep every line that contains a file path. An example of two lines from the text file (one directory path and one file path):

exampleDirectoryPath/tags/10.0.0.8/tools/
exampleFilePath/tags/10.0.0.8/tools/hello.txt

So far, to loop through the text file, I have:

foreach ($line in [System.IO.File]::ReadLines("file.txt")) {
    if ($line -match ".*/.*$") {
        $line
    }
}

Goal output:

exampleFilePath/tags/10.0.0.8/tools/hello.txt

Note: I do not want to hardcode file extensions. There are thousands of files to traverse and I don't know which extensions are present, so I would like to return all of them.

Wiktor Stribiżew

3 Answers

2

So, the basic logic here is easy:

Get-Content "file.txt" | where { $_ is a file path... }

How you fill in the filter depends on how you want to determine whether a line is a file path.

If all of your directory paths end in "/", you could simply do:

where { -not $_.EndsWith("/") }

or:

where { [System.IO.Path]::GetFileName($_) -ne "" }

If not, but all of your file paths definitely have an extension, you could do:

where { [system.io.Path]::GetExtension($_) -ne "" }

If all of the paths actually exist, you could also do this:

where { Test-Path $_ -Type Leaf }
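
As a rough illustration of how the string-based filters above behave, here is a minimal sketch that runs them over a few in-memory lines instead of file.txt. The third sample path and all variable names are made up for the example; only the first two lines come from the question.

```powershell
# Hypothetical sample lines; the first two are taken from the question.
$lines = @(
    'exampleDirectoryPath/tags/10.0.0.8/tools/'
    'exampleFilePath/tags/10.0.0.8/tools/hello.txt'
    'exampleFilePath/tags/10.0.0.8/tools/README'   # assumed: a file without an extension
)

# 1. Keep lines that do not end in "/".
$byTrailingSlash = @($lines | Where-Object { -not $_.EndsWith('/') })

# 2. Keep lines whose last path segment is non-empty.
$byFileName = @($lines | Where-Object { [System.IO.Path]::GetFileName($_) -ne '' })

# 3. Keep lines whose last path segment has an extension.
$byExtension = @($lines | Where-Object { [System.IO.Path]::GetExtension($_) -ne '' })
```

In this sample the first two filters keep both file lines, while the extension filter drops README. The extension filter also has the opposite failure mode: a directory written without a trailing slash, such as exampleDirectoryPath/tags/10.0.0.8, would be kept, because GetExtension reports ".8" for it.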
marsze
  • Thanks. I changed where { $_.EndsWith("/") } to where { ! $_.EndsWith("/") } - this ensured it was returning files, not folders. – Christian Townsend Oct 25 '21 at 23:00
  • While this is def a great, succinct answer, I just wanted to add that, depending on your use case, you should be careful about how much you rely on this method. See this discussion for a good primer on some of the nuances of this issue: https://stackoverflow.com/questions/980255/should-a-directory-path-variable-end-with-a-trailing-slash TL;DR - there is no way to determine with 100% certainty whether something is or isn't a file purely from its path string, and in certain contexts, trailing slashes can mean different things. Be esp. careful if using this cross-platform. – diopside Oct 26 '21 at 02:46

To provide a concise solution that also performs well:

(Get-Content -ReadCount 0 file.txt) -notmatch '/$'
  • Using -ReadCount 0 with Get-Content is a performance optimization that returns all lines in the input file as a single array object rather than emitting the lines one by one.

    • Additionally, -ReadCount 0 ensures that an array is output even if the input file happens to have just one line.
  • -notmatch, the negated form of the regex-based -match operator, acts as a filter with an array-valued LHS, returning the non-matching elements (lines) as a new array.

    • Regex /$ matches a verbatim / at the end ($) of each input string (line).

Note: As your question suggests, the solution above assumes that directories can be distinguished from files formally, based on whether the lines in the input file end in / or not.
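
To illustrate why the array-valued left-hand side matters, here is a minimal sketch that applies the operator to in-memory strings rather than to file.txt. It assumes, as in the question's sample, that directory lines end in /; the variable names are made up for the example.

```powershell
# Hypothetical in-memory stand-in for the lines of file.txt.
$lines = @(
    'exampleDirectoryPath/tags/10.0.0.8/tools/'
    'exampleFilePath/tags/10.0.0.8/tools/hello.txt'
)

# With an array on the left, -notmatch returns the elements that do NOT match.
$files = $lines -notmatch '/$'

# With a single string on the left, the same operator returns a Boolean instead,
# which is why receiving an array from Get-Content -ReadCount 0 matters.
$isFile = 'exampleFilePath/tags/10.0.0.8/tools/hello.txt' -notmatch '/$'
```

Here $files holds only the hello.txt line, whereas $isFile is $true rather than the string itself.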

mklement0

I personally would not use regex for this for the simple reason that, even though you may be able to validate if the path's pattern matches the pattern of a file or folder, it cannot validate if it actually exists. I would use this following your code:

$result = foreach($line in [System.IO.File]::ReadLines("file.txt"))
{
    # File.Exists is $true only for an existing file; it is $false for directories and missing paths.
    if([System.IO.File]::Exists($line))
    {
        $line
    }
}
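
For comparison, here is a minimal sketch of an existence-based check built on Test-Path -PathType Leaf. It creates a throwaway directory so it can run anywhere; the directory layout and all names are hypothetical and stand in for the paths listed in file.txt.

```powershell
# Build a throwaway tree with one subdirectory and one file (names are hypothetical).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$dir  = Join-Path $root 'tools'
$file = Join-Path $dir 'hello.txt'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -LiteralPath $file -Value 'hi'

# Stand-in for the lines of file.txt: a directory, a file, and a path that does not exist.
$lines = @($dir, $file, (Join-Path $root 'missing.txt'))

# Keep only lines that name an existing file; directories and missing paths are dropped.
$result = @($lines | Where-Object { Test-Path -LiteralPath $_ -PathType Leaf })

# Clean up the throwaway tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

Because this approach queries the file system, it only helps when the listed paths exist on the machine running the script; a path that is missing is dropped just like a directory.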
Santiago Squarzon