
I am trying to download a file from S3 using the s3cmd command-line tool. The file contains non-ASCII characters such as (楽曲満載アプリ!!最新曲から懐かしの曲、気になるあの曲も検索できる). But when I download the file to my Ubuntu machine and open it with vi, the characters are replaced with (??????). I am not sure why this happens. Any help or suggestions would be much appreciated. Thanks in advance.

  • Where are the characters getting replaced, using which application (and more importantly, which font?) – Pekka Mar 09 '16 at 11:32
  • The characters in the file itself are being replaced. I downloaded the file on Ubuntu using the terminal. I am not sure about the font. One thing I found: when I download the file on my Mac, the characters display as expected. I also tried a different Ubuntu machine, and the text displays as expected there too. – Sathiya Narayanan Mar 09 '16 at 11:39
  • Check that the MIME type is set to "text/html; charset=utf-8". There are some examples of how to do this with the aws cli here: https://github.com/aws/aws-cli/issues/1346 - Also double check that the file wasn't broken while it was being uploaded to S3 – Will Mar 09 '16 at 11:56
  • What happens when you copy and paste some Chinese text into the terminal on your Ubuntu machine? Perhaps your terminal is set up in a way that doesn't support all character sets – Will Mar 09 '16 at 11:59
  • @Will As you suggested, I tried pasting some Chinese characters into the Ubuntu terminal, but they are not pasted. Instead it shows **Display all 2729 possibilities? (y or n)**. Running locale on my Ubuntu machine gives this output: **LANG=en_US.UTF-8 LANGUAGE= LC_CTYPE=UTF-8 LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL=** – Sathiya Narayanan Mar 09 '16 at 12:12

1 Answer


I have finally resolved the issue. Posting here so that it can help someone in the future. Following Will's hint about the Ubuntu terminal settings, I looked into the machine's locale, which was en_US. To see the current locale, type locale in your terminal.
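The check described above can be sketched as the following shell snippet. It is only an illustration: describe_charmap is a hypothetical helper name introduced here, while locale and locale charmap are the standard commands on glibc-based systems such as Ubuntu.

```shell
# Describe the encoding name reported by `locale charmap`.
# (describe_charmap is a hypothetical helper, not part of any tool.)
describe_charmap() {
  case "$1" in
    UTF-8) echo "UTF-8: multi-byte text such as Japanese can be displayed" ;;
    ANSI_X3.4-1968) echo "ASCII only: non-ASCII characters will likely show as ?" ;;
    "") echo "unknown: the locale command gave no answer" ;;
    *) echo "$1: a legacy encoding; UTF-8 files may look garbled" ;;
  esac
}

# Per-category settings; error lines at the top hint at invalid values.
locale || true

# The character encoding that the current settings resolve to.
charmap=$(locale charmap 2>/dev/null || true)
describe_charmap "$charmap"
```

If the second command reports anything other than UTF-8, programs that follow the locale may not render the file correctly even though the file itself is intact.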

P.S. If the locale you want is not available, follow this link to install it:

$ sudo locale-gen "en_IN"
Generating locales...
  en_IN... done
Generation complete.

$ sudo dpkg-reconfigure locales
Generating locales...
  en_IN... up-to-date
Generation complete.

In my case, the output of locale was something like

locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE=UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
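The two error lines at the top of this output suggest that at least one variable holds a value that is not an installed locale. LC_CTYPE=UTF-8 is the likely culprit, since on Linux UTF-8 alone is an encoding name rather than a locale name. A minimal sketch for checking this is shown below; normalize_locale and locale_in_list are hypothetical helper names, and only three variables are checked for brevity.

```shell
# Normalize a locale name so that "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# Succeed if locale name $1 appears in the newline-separated list $2.
locale_in_list() {
  printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]' | tr -d '-' |
    grep -qxF -- "$(normalize_locale "$1")"
}

# Locales that have actually been generated on this machine.
installed=$(locale -a 2>/dev/null || true)

for var in LANG LC_CTYPE LC_ALL; do
  eval "value=\${$var:-}"          # read the variable named by $var
  [ -n "$value" ] || continue      # skip variables that are unset or empty
  if locale_in_list "$value" "$installed"; then
    echo "$var=$value is installed"
  else
    echo "$var=$value is NOT installed"
  fi
done
```

On a machine in the state shown above, this would presumably flag LC_CTYPE while accepting LANG, provided en_US.UTF-8 has been generated.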

I then checked the locale on the machine where the file was opening properly. It was something like

LANG=en_IN
LANGUAGE=en_IN:en
LC_CTYPE="en_IN"
LC_NUMERIC="en_IN"
LC_TIME="en_IN"
LC_COLLATE="en_IN"
LC_MONETARY="en_IN"
LC_MESSAGES="en_IN"
LC_PAPER="en_IN"
LC_NAME="en_IN"
LC_ADDRESS="en_IN"
LC_TELEPHONE="en_IN"
LC_MEASUREMENT="en_IN"
LC_IDENTIFICATION="en_IN"
LC_ALL=

I then opened the locale file with the command

sudo vi /etc/default/locale

and replaced the content of the file with

LANG=en_IN
LANGUAGE=en_IN:en
LC_CTYPE="en_IN"
LC_NUMERIC="en_IN"
LC_TIME="en_IN"
LC_COLLATE="en_IN"
LC_MONETARY="en_IN"
LC_MESSAGES="en_IN"
LC_PAPER="en_IN"
LC_NAME="en_IN"
LC_ADDRESS="en_IN"
LC_TELEPHONE="en_IN"
LC_MEASUREMENT="en_IN"
LC_IDENTIFICATION="en_IN"
LC_ALL=

After making this change I restarted the machine, opened the file again, and baammmm: the Chinese characters showed up as expected. Thanks @Will for the hint and this link for making my day :)
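To rule out a damaged download, as opposed to a display problem, the file's bytes can be checked independently of the locale. The sketch below assumes iconv and od are available (they are on a stock Ubuntu install); is_valid_utf8 is a hypothetical helper name, and the scratch file stands in for the downloaded file.

```shell
# Succeed if the file decodes cleanly as UTF-8, whatever the current locale is.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Scratch file holding the first character of the sample text (楽), written
# as raw bytes so the demo does not depend on the terminal's encoding.
sample=$(mktemp)
printf '\346\245\275\n' > "$sample"

if is_valid_utf8 "$sample"; then
  echo "valid UTF-8: a display problem points at the locale or terminal"
else
  echo "not valid UTF-8: the file may have been damaged in transit"
fi

# Show the raw bytes; 楽 is stored as e6 a5 bd in UTF-8.
od -An -tx1 "$sample"
rm -f "$sample"
```

Running is_valid_utf8 on the real downloaded file gives the same kind of answer: if it succeeds while vi still shows question marks, the file is fine and the environment is at fault.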
