
I am trying to just remove the first line of about 5000 text files before importing them.

I am still very new to PowerShell, so I am not sure what to search for or how to approach this. My current concept, in pseudo-code:

set-content file (get-content unless line contains amount)

However, I can't seem to figure out how to do something like contains.

– Buddy Lindsey

11 Answers


While I really admire @hoge's answer, both for its very concise technique and for the wrapper function that generalizes it (and I encourage upvotes for it), I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).

Assuming the file is not huge, you can force the pipeline to operate in discrete sections, obviating the need for a temp file, with judicious use of parentheses: they force Get-Content to finish reading and close the file before Set-Content opens it for writing:

(Get-Content $file | Select-Object -Skip 1) | Set-Content $file

... or in short form:

(gc $file | select -Skip 1) | sc $file
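
One caveat: when Set-Content rewrites the file, it applies its own default encoding. If you know your files' encoding, a sketch that pins it explicitly (the ASCII choice here is an assumption about your data; see also Keith Hill's comment under the next answer):

(Get-Content $file | Select-Object -Skip 1) | Set-Content $file -Encoding Ascii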
– Michael Sorens

It is not the most efficient in the world, but this should work:

get-content $file |
    select -Skip 1 |
    set-content "$file-temp"
move "$file-temp" $file -Force
– Richard Berg
  • When I try to run this it seems that it errors out on the -Skip. Could that maybe be from a different version? – Buddy Lindsey Jan 15 '10 at 20:41
  • -Skip is new to Select-Object in PowerShell 2.0. Also, if the files are all ASCII then you might want to use set-content -enc ascii. If the encodings are mixed, then it gets trickier unless you don't care about the file encoding. – Keith Hill Jan 15 '10 at 20:51

Using variable notation, you can do it without a temporary file:

${C:\file.txt} = ${C:\file.txt} | select -skip 1

function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
  if ( -not (Test-Path $path -PathType Leaf) ) {
    throw "invalid filename"
  }

  # For each matching file, build a "${full-path} = ${full-path} | select -skip N"
  # assignment in variable notation and evaluate it with Invoke-Expression
  ls $path |
    % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}
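
A hypothetical invocation (the wildcard path and skip count are illustrative):

Remove-Topline -path .\*.txt -skip 1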
– hoge

I just had to do the same task, and gc | select ... | sc took over 4 GB of RAM on my machine while reading a 1.6 GB file. It still hadn't finished 20 minutes after it had read in the whole file (as reported by Read Bytes in Process Explorer), at which point I had to kill it.

My solution was to use a more .NET approach: StreamReader + StreamWriter. See this answer for a great discussion of the performance: In Powershell, what's the most efficient way to split a large text file by record type?

Below is my solution. Yes, it uses a temporary file, but in my case, it didn't matter (it was a freaking huge SQL table creation and insert statements file):

(measure-command{
    $i = 0
    $ins = New-Object System.IO.StreamReader "in/file/pa.th"
    $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
    while( !$ins.EndOfStream ) {
        $line = $ins.ReadLine();
        # Write every line except the first (index 0)
        if( $i -ne 0 ) {
            $outs.WriteLine($line);
        }
        $i = $i+1;
    }
    $outs.Close();
    $ins.Close();
}).TotalSeconds

It returned:

188.1224443
– AASoft
  • IIRC this is because the parentheses around the gc|select mean it reads the entire file into memory before piping it through. Otherwise the open stream causes set-content to fail. For big files I think your approach is probably best – Alex Mar 15 '13 at 15:58
  • Thank you, @AASoft, for your great solution! I've allowed myself to improve it slightly by dropping the comparison operation in every loop speeding up the process by something like 25% - see [my answer](http://stackoverflow.com/a/24746158/177710) for details. – Oliver Jul 14 '14 at 21:20

Inspired by AASoft's answer, I went out to improve it a bit more:

  1. Avoid the loop variable $i and the comparison with 0 in every loop
  2. Wrap the execution into a try..finally block to always close the files in use
  3. Make the solution work for an arbitrary number of lines to remove from the beginning of the file
  4. Use a variable $p to reference the current directory

These changes lead to the following code:

$p = (Get-Location).Path

(Measure-Command {
    # Number of lines to skip
    $skip = 1
    $ins = New-Object System.IO.StreamReader ($p + "\test.log")
    $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
    try {
        # Skip the first N lines, but allow for fewer than N, as well
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            $ins.ReadLine()
        }
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}).TotalSeconds

The first change brought the processing time for my 60 MB file down from 5.3 s to 4 s. The rest of the changes are more cosmetic.
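
For reuse, here is the same logic wrapped as a hypothetical helper function (the name and parameters are my own, not part of the answer above):

function Skip-TopLines( [string]$inPath, [string]$outPath, [int]$skip = 1 ) {
    # Pass absolute paths: .NET resolves relative paths against its own
    # current directory, which can differ from the PowerShell location
    # (that is why $p is used above)
    $ins = New-Object System.IO.StreamReader $inPath
    $outs = New-Object System.IO.StreamWriter $outPath
    try {
        # Discard the first $skip lines, allowing for a shorter file
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            [void]$ins.ReadLine()
        }
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}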

– Oliver
  • You may want to add `-and !$ins.EndOfStream` to the `for` loop conditional to cover the cases where the file has fewer lines than `$skip`. – AASoft Nov 10 '17 at 07:11
  • Thanks for the heads up! That makes sense :-) – Oliver Nov 10 '17 at 11:32

$x = get-content $file
$x[1..$x.count] | set-content $file

Just that much. Long boring explanation follows. Get-Content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guys posts.

For example, if we define an array variable like this,

$array = @("first item","second item","third item")

so $array returns

first item
second item
third item

then we can "index into" that array to retrieve only its 1st element

$array[0]

or only its 2nd

$array[1]

or a range of index values from the 2nd through the last.

$array[1..$array.count]
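
One subtlety: $array.count is 3, so the range 1..$array.count runs from 1 to 3, and index 3 is one past the last valid index (2). PowerShell silently drops out-of-range indexes when slicing, so the expression still returns just the 2nd and 3rd items:

$array[1..$array.count]   # indexes 1, 2, 3; index 3 yields nothing
second item
third item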
– noam

I just learned from a website:

Get-ChildItem *.txt | ForEach-Object { (Get-Content $_) | Where-Object { (1) -notcontains $_.ReadCount } | Set-Content -Path $_ }

Or you can use the aliases to make it short, like:

gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }
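
This works because Get-Content stamps each line it emits with a ReadCount property holding its 1-based line number, so excluding ReadCount 1 drops the first line. A sketch of extending the alias form to skip, say, the first three lines (the range is illustrative):

gci *.txt | % { (gc $_) | ? { (1..3) -notcontains $_.ReadCount } | sc -path $_ }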
– Luke Du

Another approach to removing the first line from a file is the multiple-assignment technique:

# The first line lands in $firstLine; the remaining lines land in $restOfDocument
$firstLine, $restOfDocument = Get-Content -Path $filename
$modifiedContent = $restOfDocument
$modifiedContent | Out-String | Set-Content $filename
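
A slightly leaner sketch of the same idea: Set-Content accepts an array of lines directly, so the intermediate variable and the Out-String step (which joins the lines and appends a trailing newline) can be dropped:

$firstLine, $restOfDocument = Get-Content -Path $filename
Set-Content -Path $filename -Value $restOfDocument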
– Venkataraman R

`-Skip` didn't work for me, so my workaround is:

$LinesCount = $(get-content $file).Count
get-content $file |
    select -Last $($LinesCount-1) | 
    set-content "$file-temp"
move "$file-temp" $file -Force
– Emperor XLII

Following on from Michael Sorens's answer, if you want to edit all .txt files in the current directory and remove the first line from each:

Get-ChildItem (Get-Location).Path -Filter *.txt | 
Foreach-Object {
    (Get-Content $_.FullName | Select-Object -Skip 1) | Set-Content $_.FullName
}

For smaller files you could use this:

& C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null

... but it's not very effective at processing my example file of 16 MB: it doesn't seem to terminate and release the lock on newfile.csv.