
I am trying to create a Word document from a text file, but the text file contains page breaks. I want to remove or replace those page breaks in the text file, so that I can later add page breaks to the Word document only in the places where I actually need them.

Below is the PowerShell code I tried for replacing the page break in the text file. It does not work: using "`f" in place of the page break has no effect.

$oldWord = "`fPage"
$newWord = "Page"

#-- replace the page breaks in the file
(Get-Content $inputFilePath) -replace '$oldWord', '$newWord' | Set-Content $inputFilePath

The symbol shown for the page break in the text editor UltraEdit is ♀. Replacing the character in UltraEdit is easy, but I want to replace or remove it using PowerShell.

Below is a related question, but it is still unanswered with regard to PowerShell code.

How to remove unknown line break (special character) in text file?

RayBacker
  • @RyanBemrose I tried using -Raw. Below is the error. Get-Content : A parameter cannot be found that matches parameter name 'Raw'. – RayBacker Mar 05 '17 at 05:17
  • I guess Windows 7? I think the `-Raw` param was added in 8.0 or 8.1. Anyway, if the posted answers don't work for you, we'll probably need to know exactly what the character is. Try opening the file in a hex editor to see the end of page character sequence. – Ryan Bemrose Mar 05 '17 at 05:20
  • One reason your code was not working is that you put the variables in single quotes rather than double quotes. – Mark Wragg Mar 05 '17 at 08:06
  • @RyanBemrose The -Raw parameter works from PowerShell v3 (per abhijith-pk's answer). I have PowerShell v2, which ships with Windows 7. – RayBacker Mar 06 '17 at 02:47
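
The quoting point raised in the comments can be checked with a short sketch. It reuses `$oldWord` from the question; the names `$single`, `$double` and `$code` are made up for illustration. It should also run on PowerShell v2, since it only uses string literals and a cast.

```powershell
$oldWord = "`fPage"

# Single quotes: no expansion, so this is the literal 8-character text $oldWord.
$single = '$oldWord'

# Double quotes (or the bare variable): the variable's value is expanded.
$double = "$oldWord"

# The first character of the expanded value is the form feed, character code 12.
$code = [int]$double[0]
```

Passed to -replace as a pattern, the single-quoted version is treated as a regex that begins with the end-of-line anchor `$`, so it never matches the text in the file.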

2 Answers


For page breaks, you can use:

[io.file]::ReadAllText( 'H:\oldFile.txt') | %{$_.replace("`f","")} >h:\newFile.txt

The snippet below works from PowerShell v3 onward:

cat H:\oldFile.txt -raw | %{$_.replace("`f","")} >h:\newFile.txt
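
If the file should be cleaned in place rather than redirected to a new file, the same idea can be sketched with .NET calls only, which should also work on PowerShell v2. The function name `Remove-FormFeed` and the temporary demo file are hypothetical. The sketch assumes the file is ASCII or UTF-8, because `ReadAllText` and `WriteAllText` default to UTF-8 when no encoding is passed. It also avoids `>`, which in Windows PowerShell writes UTF-16 by default.

```powershell
# Hypothetical helper, not part of the original answer.
function Remove-FormFeed {
    param([string]$Path)

    # Read the whole file as a single string (assumes ASCII/UTF-8 content).
    $text = [IO.File]::ReadAllText($Path)

    # "`f" is the form feed character; String.Replace does a literal, non-regex replace.
    $clean = $text.Replace("`f", "")

    # Overwrite the same file with the cleaned text.
    [IO.File]::WriteAllText($Path, $clean)
}

# Demonstration on a temporary file, so no real data is touched.
$demoPath = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($demoPath, "Page 1`r`n`fPage 2`r`n")
Remove-FormFeed -Path $demoPath
$result = [IO.File]::ReadAllText($demoPath)
Remove-Item $demoPath
```

Pass a full path: the .NET file methods resolve relative paths against the process working directory, which may differ from the current PowerShell location.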
ClumsyPuffin

Thanks for the question! This one was interesting.

So the form feed special character is a bugger in PowerShell. If you echo it out, you just get an odd character, or a square if your console cannot display it. But if you copy and paste it back into the PowerShell terminal, it just moves your command entry point to the top of the screen. Odd.

What I did was try to find ways of replacing special characters in general. You can use regexes in PowerShell with $oldWord -replace 'REGEX_GOES_HERE', 'THING_TO_REPLACE_WITH_HERE', so what I came up with is this:

$oldWord -replace '[\f]', '' #You can also use \r for carriage return, \n for new line, \t for tab, \s for ALL whitespace

This will simply remove all instances of the Form-Feed character.
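
As a rough sketch of how this could apply to the question's case, the -replace operator also works on an array of lines, which is what Get-Content returns. The sample lines below are invented stand-ins for file content, and `$cleaned` is an illustrative name.

```powershell
# Sample lines standing in for the output of Get-Content (assumed data).
$lines = @("Intro text", "`fPage 2", "Body text", "`fPage 3")

# '\f' in single quotes reaches the regex engine unchanged and matches a form feed.
# Applied to an array, -replace processes each element and returns a new array.
$cleaned = $lines -replace '\fPage', 'Page'

# With a real file, the same operator could be used like this
# ($inputFilePath is the variable from the question):
# (Get-Content $inputFilePath) -replace '\fPage', 'Page' | Set-Content $inputFilePath
```

Here the pattern is a literal in single quotes, so no variable expansion is needed. If the pattern is kept in a variable instead, it has to be passed unquoted or in double quotes.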

Hope this helps! Cheers!

B. Witter
  • I tried using $oldWord = "[\f']Page" and $newWord = "Page", but this didn't work. I am not used to regex. – RayBacker Mar 05 '17 at 06:02
  • Sorry, I made a minor typo in the code I sent. If your string looks like "`fHere is some words with that weird symbol `f in it" and is saved to a variable `$oldVariable`, then you can get the "fixed" version by doing `$oldVariable -replace "[\f]", ""`, which will find all instances of `f and replace them with an empty string. The result would be "Here is some words with that weird symbol in it" – B. Witter Mar 11 '17 at 15:52