I am using gsutil with the "rsync" command to upload business-critical files to Google Cloud Storage as a backup. Unfortunately, most of the archive and file names are in Greek, for example "αντιγραφο.txt". With English file names rsync works fine, but when gsutil tries to sync files with Greek names, it throws an exception.

The command is:

gsutil -m rsync -d -r H:\Test gs://myserver.com/data

Building synchronization state...
Caught non-retryable exception while listing file://H:\Test: CommandException: Invalid Unicode path encountered ('H:\Test\\xe1\xed\xf4\xe9\xe3\xf1\xe1\xf6\xef (1).txt'). gsutil cannot proceed with such files present. Please remove or rename this file and try again. NOTE: the path printed above replaces the problematic characters with a hex-encoded printable representation. For more details (including how to convert to a gsutil-compatible encoding) see gsutil help encoding.
CommandException: Caught non-retryable exception - aborting rsync


I tried to convert the filenames to UTF-8, but I can't find anything that works from the Windows cmd prompt. I've searched many sites for iconv and native2ascii, but I couldn't find anything useful. The server runs Windows 2012, so I cannot use "convmv" to convert the filenames to UTF-8. Is there another way to convert all filenames to UTF-8 in an automated manner before I upload them to the cloud? The archive is 600 GB, so I can't just zip it and send it. I also want this to run automatically through Task Scheduler.

Thank you very much!

  • Hi, I am pretty sure that the path of the file is not to blame but its contents. I am not sure if that is what you meant. Thanks again! – Γιώργος Χατζηθανάσης Dec 09 '16 at 16:44
  • The Greek filenames are not proper Unicode, are they? αντιγραφο in UTF-8 would look like CE B1 CE BD CF 84 CE B9 CE B3 CF 81 CE B1 CF 86 CE BF in hex. What you have here looks like a different codepage. So does rsync know which codepage it is? – Mr Lister Dec 09 '16 at 17:00
  • I am not sure what you mean, but the folder has only one file in it: "αντιγραφο.txt". So I am sure that gsutil is complaining about this file. I think the filename encoding is windows-1253, but my only concern is how to convert the filenames to UTF-8 so that I can proceed with the backup process. – Γιώργος Χατζηθανάσης Dec 09 '16 at 17:22
  • I am saying that if gsutil complains about the name not being proper Unicode, that means it thinks the name is in Unicode to begin with. So that's the problem. Does [this link](https://cloud.google.com/storage/docs/gsutil/addlhelp/Filenameencodingandinteroperabilityproblems) help? – Mr Lister Dec 09 '16 at 17:40
  • Unfortunately this is the first thing I came across when the problem started. – Γιώργος Χατζηθανάσης Dec 10 '16 at 14:45
  • I've used convmv from cp1552 to UTF-8. Still, when I try to send the file, it fails again, complaining about Unicode. – Γιώργος Χατζηθανάσης Dec 12 '16 at 13:31
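
The byte comparison made in the comments can be checked with a short Python 3 sketch. It assumes, as guessed above, that the hex-escaped bytes in the gsutil error are windows-1253 (Python codec name `cp1253`); the variable names are only illustrative. It shows the encoding mismatch and is not a fix for the rsync failure.

```python
# Bytes copied from the hex-escaped path in the gsutil error message.
raw = b"\xe1\xed\xf4\xe9\xe3\xf1\xe1\xf6\xef"

# Assumption: the name is in the Greek ANSI code page (windows-1253).
# If that guess is right, this decodes to "αντιγραφο".
name = raw.decode("cp1253")

# The same name encoded as UTF-8, shown as space-separated hex bytes.
utf8_hex = " ".join(f"{b:02X}" for b in name.encode("utf-8"))
print(utf8_hex)

# Check whether the raw bytes from the error are themselves valid UTF-8.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

If the windows-1253 assumption holds, the hex line should match the UTF-8 sequence quoted in the comment above, while the raw bytes from the error fail to decode as UTF-8, which would be consistent with gsutil rejecting the path.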
