I have a lot of ANSI text files that vary in size (from a few KB up to 1GB+) that I need to convert to Unicode.

At the moment, this is done by loading each file into Notepad, choosing "Save As..." and selecting Unicode as the encoding. Obviously this is very time-consuming!

I'm looking for a way to convert all the files in one hit (in Windows). The files are in a directory structure so it would need to be able to traverse the full folder structure and convert all the files within it.

I've tried a few options but so far nothing has really ticked all the boxes:

  • ansi2unicode command line utility. This has been the closest to what I'm after, as it processes files recursively in a folder structure, but it keeps crashing before it has finished converting.
  • CpConverter GUI utility. Works OK up to a point but struggles with multiple files in a folder structure; it only seems able to handle files in one folder.
  • There's a DOS command that works OK on smaller files but doesn't seem to be able to cope with large files.
  • Tried the GnuWin sed utility, but it crashes every time I try to install it.

So I'm still looking! If anyone has any recommendations I'd be really grateful.

Thanks...

user2724502

1 Answer

OK, so in case anyone else is interested, I found a way to do this using PowerShell:

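# Enumerate the files first (the parentheses), so the *_unicode.csv files
# created during the run are not picked up and converted again.
(Get-ChildItem "c:\some path\" -Filter *.csv -Recurse) |
    ForEach-Object {
        # Log a timestamp and the full path of the file being converted
        Write-Host (Get-Date).ToString() $_.FullName
        # Read the file line by line and write it out as UTF-16 LE ("unicode")
        Get-Content $_.FullName | Set-Content -Encoding unicode ($_.FullName + '_unicode.csv')
    }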

This recurses through the entire folder structure and converts all CSV files to Unicode (in PowerShell, "unicode" means UTF-16 little-endian). The converted files are written to the same locations as the originals, with "_unicode.csv" appended to the filename. You can change the value of the -Encoding parameter if you want to convert to something different (e.g. utf8 for UTF-8).

It also prints the full path of each file, with a timestamp, as it is converted.
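For the very large files mentioned in the question, a variant that reads and writes in fixed-size chunks through .NET streams may be worth trying. This is only a sketch under some assumptions: Convert-AnsiFile is a hypothetical helper name, and code page 1252 (Western European) is assumed to be the "ANSI" encoding of the source files, so change the default if yours differ. Unlike the Get-Content pipeline, it names the source encoding explicitly and copies line endings through unchanged.

```powershell
# Sketch: stream one file from an assumed ANSI code page to UTF-16 LE ("Unicode").
# Convert-AnsiFile is a hypothetical helper name; code page 1252 is an assumption.
function Convert-AnsiFile {
    param(
        [Parameter(Mandatory)] [string] $Source,       # full path of the ANSI file
        [Parameter(Mandatory)] [string] $Destination,  # full path of the file to create
        [int] $CodePage = 1252
    )
    $inEnc  = [System.Text.Encoding]::GetEncoding($CodePage)
    $outEnc = [System.Text.Encoding]::Unicode   # UTF-16 LE; the writer emits a BOM
    $reader = New-Object System.IO.StreamReader($Source, $inEnc)
    try {
        $writer = New-Object System.IO.StreamWriter($Destination, $false, $outEnc)
        try {
            # Copy in 64K-character chunks so memory use stays flat for 1GB+ files
            $buffer = New-Object char[] 65536
            while (($count = $reader.Read($buffer, 0, $buffer.Length)) -gt 0) {
                $writer.Write($buffer, 0, $count)
            }
        }
        finally { $writer.Dispose() }
    }
    finally { $reader.Dispose() }
}

# Possible usage, mirroring the loop above (paths are placeholders):
# (Get-ChildItem "c:\some path\" -Filter *.csv -Recurse) |
#     ForEach-Object { Convert-AnsiFile $_.FullName ($_.FullName + '_unicode.csv') }
```

Pass full paths (such as $_.FullName), because the .NET stream classes resolve relative paths against the process working directory rather than the current PowerShell location.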

user2724502