
I have a problem with charset encoding/decoding in my Rails 4 web application. I see words like "rappresentÃ²" or "Ã¨" on my web page, but I want to see "rappresentò" and "è".

This is my stack:

Server OS: Ubuntu 14.04; MySQL database: "foo"; table: "bar"

Table description:

| description | mediumtext | latin1_swedish_ci | YES |....

This is my database config in Rails:

default: &default
  adapter: mysql2
  encoding: latin1
  pool: 5
  username: root
  password: ****
  socket: ....

In the Rails view, the meta charset is "ISO-8859-1".

Note:
- I see the problem only on the server machine.
- When I connect to the database via ssh, I see the characters correctly.

This is my output in the Rails console:

Bar.find(38).description
Il volume Ã¨ arricchito dalle illustrazioni di Jean-Jacques SempÃ©. "
  • You need to post the code that puts the characters you don't like into the page. – 7stud Jan 29 '15 at 09:38
  • mmm..are you sure that the code is necessary? because I take the data from the controller and put them in the view... – user3541704 Jan 29 '15 at 09:44
  • Alright then, show me the console output of find() for one row where there is an "è". – 7stud Jan 29 '15 at 10:42
  • This is my console output. description field of foo.find(id): " Claude Debussy (1862-1918) la cui musica rappresentÃ² . Il volume Ã¨ ". "è" is saved like "Ã¨" in the database – user3541704 Jan 29 '15 at 10:53
  • No, not like that. I want you to copy and paste the find command and its output into your question, then highlight it, and click on `{ }` to put code tags around it. Secondly, I want you to post what encoding your browser is set to: in Chrome it's under View>Encoding. – 7stud Jan 29 '15 at 10:56
  • i pasted "find command", my chrome encoding is "auto-detect / utf-8" – user3541704 Jan 29 '15 at 11:25

1 Answer


It looks to me like your terminal's encoding is ISO-8859-1, and for the accented e, the string contains the sequence \xC3\xA8. But in ISO-8859-1, you have this:

\xC3  --> Ã
\xA8  --> ¨

And, that is what you are seeing for the output in your ISO-8859-1 terminal.

In UTF-8, the sequence \xC3\xA8 happens to represent one character: è. So, that means you inserted the UTF-8 sequence \xC3\xA8 in your database for è--instead of the ISO-8859-1 sequence for è, which is \xE8.

UTF-8:       è  ->  \xC3\xA8
ISO-8859-1:  è  ->  \xE8

Because all characters in ISO-8859-1 are represented by one byte, and because the sequence \xC3\xA8 represents two bytes, that means any ISO-8859-1 device is going to interpret that sequence as the character \xC3, namely Ã, followed by the character \xA8, which is ¨.
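The two readings of the same bytes can be reproduced in Ruby. This is only a minimal sketch: the helper `read_bytes_as` and the constant `E_GRAVE_UTF8` are hypothetical names made up for illustration, and the output assumes it is printed on a UTF-8 terminal.

```ruby
# Hypothetical helper (not from the question): interpret raw bytes under a
# given encoding, then transcode to UTF-8 so the result can be printed.
def read_bytes_as(bytes, encoding_name)
  raw = bytes.pack('C*')              # build a binary string from the byte values
  raw.force_encoding(encoding_name)   # relabel the bytes; no conversion happens here
  raw.encode('UTF-8')                 # transcode for display on a UTF-8 terminal
end

# The byte sequence discussed above: \xC3\xA8
E_GRAVE_UTF8 = [0xC3, 0xA8]

puts read_bytes_as(E_GRAVE_UTF8, 'UTF-8')       # one character: è
puts read_bytes_as(E_GRAVE_UTF8, 'ISO-8859-1')  # two characters: Ã¨
```

The key distinction is that `force_encoding` only changes the label on the bytes, while `encode` actually rewrites them.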

Unfortunately, the strings you inserted in your database were encoded in UTF-8. Therefore, you should have specified the encoding for your database as UTF-8 to begin with; or you should have converted your strings from UTF-8 to ISO-8859-1 before inserting them in your database:

2.0.0-p481 :029 > str = "R\xC3\xA8"
 => "RÃ¨"    #Terminal set to 'ISO-8859-1' encoding

2.0.0-p481 :030 > str.encoding
 => #<Encoding:UTF-8> 

2.0.0-p481 :031 > str = str.encode('ISO-8859-1', 'UTF-8')
 => "R\xE8" 

2.0.0-p481 :032 > str.encoding
 => #<Encoding:ISO-8859-1> 

2.0.0-p481 :033 > puts str
Rè
 => nil 
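One caveat with the conversion above: UTF-8 can represent characters that ISO-8859-1 cannot (the euro sign, for example), and a plain `encode` raises `Encoding::UndefinedConversionError` for those. A possible sketch of a more forgiving conversion follows; `to_latin1` is a hypothetical helper name, and replacing unknown characters with "?" is just one assumed policy, not something the question prescribes.

```ruby
# Hypothetical helper: convert UTF-8 text to ISO-8859-1 before storing it,
# substituting a replacement for anything ISO-8859-1 cannot represent.
def to_latin1(text, replacement = '?')
  text.encode('ISO-8859-1', 'UTF-8',
              undef: :replace,     # character has no ISO-8859-1 equivalent
              invalid: :replace,   # byte sequence is not valid UTF-8
              replace: replacement)
end

p to_latin1("R\u00E8").bytes    # [82, 232], i.e. "R" followed by \xE8
p to_latin1("5 \u20AC").bytes   # [53, 32, 63], the euro sign became "?"
```

Silent replacement loses data, so if the text may contain characters outside ISO-8859-1, storing it as UTF-8 is the safer of the two options.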
7stud
  • My terminal's encoding is UTF-8. I don't understand, because when I run the find command on my local machine Rails shows the character correctly (my output is "è"), while the same command via ssh gives "Ã¨" – user3541704 Jan 29 '15 at 13:54
  • Then, apply all my comments to your remote host. Alternatively, change the encoding of your terminal to `ISO-8859-1`. The bottom line is that you inserted `UTF-8` data into your database, namely the sequence `\xC3\xA8` to represent an accented e; therefore any `ISO-8859-1` device will display that sequence as two junk characters. On the other hand, a `UTF-8` device will see the sequence `\xC3\xA8` as a single character: an accented e. – 7stud Jan 30 '15 at 07:51