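- In Windows PowerShell, Export-Csv always double-quotes every field (PowerShell 7 adds a -UseQuotes parameter to control this); Import-Csv reads quoted and unquoted fields alike.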
- If the encoding is, and should stay, UTF-8, append -Encoding UTF8 on both import and export.
- To remove diacritics only from specified columns, iterate over the rows and apply the Remove-Diacritics function to just those columns.
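As a small illustration of the quoting point, here is a sketch that assumes PowerShell 7.0 or later, where ConvertTo-Csv and Export-Csv accept -UseQuotes. The sample row and the $csvLines variable are made up for the example; Windows PowerShell 5.1 does not have this parameter.

```powershell
# Assumes PowerShell 7.0+; -UseQuotes does not exist in Windows PowerShell 5.1.
# Hypothetical sample row: only the second value contains the delimiter.
$rows = [pscustomobject]@{ UserName = 'Test01'; LastName = 'Doe, Jr.' }

# AsNeeded quotes only the fields that require it (here, the one with a comma).
$csvLines = $rows | ConvertTo-Csv -UseQuotes AsNeeded
$csvLines
```

Export-Csv takes the same -UseQuotes value if you want the file written without blanket quoting.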
Given a sample users.csv:
"UserName","LastName"
"Test01ñ","Test01ñ"
"DúSibaagh01","DúSibaagh01"
"ËTheroË01Ë","ËTheroË01Ë"
"DMrçzundaljak01","DMrçzundaljak01"
"PçSchpaglawarz01ç","PçSchpaglawarz01ç"
This script:
## Q:\Test\2018\12\24\SO_53912246.ps1
function Remove-Diacritics {
    param ([String]$src = [String]::Empty)
    # Source: https://stackoverflow.com/a/7840951/6811411
    # Decompose each character into base letter + combining marks (FormD)
    $normalized = $src.Normalize( [Text.NormalizationForm]::FormD )
    $sb = New-Object Text.StringBuilder
    $normalized.ToCharArray() | % {
        # Keep every character except the combining (non-spacing) marks
        if( [Globalization.CharUnicodeInfo]::GetUnicodeCategory($_) -ne
            [Globalization.UnicodeCategory]::NonSpacingMark ) {
            [void]$sb.Append($_)
        }
    }
    $sb.ToString()
}
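# Read the file as UTF-8 so the accented characters survive the import
$CsvData = Import-Csv .\Users.csv -Encoding UTF8
# Clean only the UserName column; LastName is left untouched
$CsvData | ForEach-Object {
    $_.UserName = Remove-Diacritics $_.UserName
}
$CsvData
$CsvData | Export-Csv .\New_Users.csv -Encoding UTF8 -NoTypeInformation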
prints this to the console and writes the same rows to New_Users.csv:
UserName LastName
-------- --------
Test01n Test01ñ
DuSibaagh01 DúSibaagh01
ETheroE01E ËTheroË01Ë
DMrczundaljak01 DMrçzundaljak01
PcSchpaglawarz01c PçSchpaglawarz01ç
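If more than one column needs cleaning, the same loop can be driven by a list of column names. The following is a minimal sketch, not part of the original script: the $Columns variable is a hypothetical name, and two in-memory rows stand in for the Import-Csv call so the snippet runs on its own. The accented characters are built from code points so the result does not depend on the encoding of the script file.

```powershell
function Remove-Diacritics {
    param ([String]$src = [String]::Empty)
    $sb = New-Object Text.StringBuilder
    # FormD splits accented letters into base letter + combining mark
    foreach ($c in $src.Normalize([Text.NormalizationForm]::FormD).ToCharArray()) {
        if ([Globalization.CharUnicodeInfo]::GetUnicodeCategory($c) -ne
            [Globalization.UnicodeCategory]::NonSpacingMark) {
            [void]$sb.Append($c)
        }
    }
    $sb.ToString()
}

# Hypothetical list of columns to clean; adjust to your own header names.
$Columns = 'UserName', 'LastName'

# Stand-in for: Import-Csv .\Users.csv -Encoding UTF8
# U+00F1 is n with tilde, U+00FA is u with acute.
$CsvData = @(
    [pscustomobject]@{ UserName = "Test01$([char]0x00F1)"; LastName = "Test01$([char]0x00F1)"; Note = "keep$([char]0x00FA)" }
    [pscustomobject]@{ UserName = "D$([char]0x00FA)Sibaagh01"; LastName = "D$([char]0x00FA)Sibaagh01"; Note = 'plain' }
)

foreach ($row in $CsvData) {
    foreach ($col in $Columns) {
        # $row.$col resolves the property by the name held in $col
        $row.$col = Remove-Diacritics $row.$col
    }
}
$CsvData
```

Columns that are not listed in $Columns (Note in this sketch) keep their diacritics, which matches the single-column behaviour shown above.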