If you wish to replace all characters that are not letters or numbers, you may do the following:
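$content = Get-Content test.txt
# Get-Content returns one string per line. Count the special characters line by line;
# passing the whole array to Matches would join the lines with spaces and count those as well.
$count = ($content | ForEach-Object { [regex]::Matches($_, '[^\p{L}\p{N}]').Count } | Measure-Object -Sum).Sum
if ($count) {
    Write-Output "This file contains $count special characters"
}
# -replace is applied to each line, so the line breaks themselves are left intact
$UpdatedContent = Set-Content -Value ($content -replace '[^\p{L}\p{N}]', "`t") -Path test.txt -PassThru
Write-Output "File content without special characters"
$UpdatedContent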
Explanation:
Since -replace uses regex matching, you can supply a matching pattern and a replacement string. [^...] is a negated character class: it matches any single character that is not (^) listed inside the brackets. \p{L} matches any Unicode letter and \p{N} matches any Unicode number. Each special character is replaced with a tab character, which PowerShell writes as "`t" inside a double-quoted string.
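As a quick check of the pattern described above, here is a minimal sketch run against an in-memory string instead of a file. The sample string and the variable names are made up for illustration and are not part of the original script.

```powershell
# Hypothetical sample text (an assumption for illustration, not from test.txt)
$sample  = 'a-b c1!'
$pattern = '[^\p{L}\p{N}]'

# Every character that is neither a Unicode letter nor a Unicode number
$specialCount = [regex]::Matches($sample, $pattern).Count
$specialCount    # 3: the hyphen, the space and the exclamation mark

# Each of those characters becomes a tab
$replaced = $sample -replace $pattern, "`t"
$replaced
```

Testing the pattern on a short string like this before touching the file makes it easier to see exactly which characters it treats as special.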
If you want a run of consecutive special characters to be replaced by a single tab rather than one tab per character, use '[^\p{L}\p{N}]+' in the -replace expression only. The counting expression should keep the pattern without the +, because there each special character has to be counted individually. The + quantifier matches one or more consecutive occurrences of the preceding element, here the character class.
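The difference between the two patterns can be sketched as follows. The sample string and variable names are hypothetical, chosen only to show a run of several special characters.

```powershell
# Hypothetical sample text containing runs of special characters
$sample = 'one, two...three'

# Without +: one tab for every special character
$perChar = $sample -replace '[^\p{L}\p{N}]', "`t"

# With +: one tab for every run of special characters
$perRun = $sample -replace '[^\p{L}\p{N}]+', "`t"

# The count still uses the pattern without +, so each character is counted
$count = [regex]::Matches($sample, '[^\p{L}\p{N}]').Count

$perChar.Length   # 16, same length as the input
$perRun.Length    # 13, each run collapsed to a single tab
$count            # 5
```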
If the file may contain non-English letters that you also want to replace, use '[^a-zA-Z0-9]' as the pattern instead, since \p{L} and \p{N} accept letters and numbers from every script.
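To illustrate how the two patterns treat a non-English letter, here is a small sketch. The sample string is an assumption, and an underscore is used as the replacement instead of a tab purely so the result is visible when printed.

```powershell
# Hypothetical sample: "café 42", with é built from its code point so the
# example does not depend on the script file's encoding
$sample = "caf$([char]0x00E9) 42"

# Unicode-aware pattern: é counts as a letter, only the space is replaced
$unicodeResult = $sample -replace '[^\p{L}\p{N}]', '_'

# ASCII-only pattern: é is replaced as well
$asciiResult = $sample -replace '[^a-zA-Z0-9]', '_'

$unicodeResult   # café_42
$asciiResult     # caf__42
```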