14

I'm on Rails 4 and Ruby 1.9.3.

I use "strange" characters very often, so I have to declare UTF-8 encoding at the top of all .rb files.

Is there any way to set UTF-8 as the default encoding for Ruby 1.9.3?


I tried all the answers, but when I run rake db:seed and create an object whose attributes contain characters that are not valid US-ASCII, I still get this error:

`block in trace_on': invalid byte sequence in US-ASCII (ArgumentError)
the Tin Man
Fellow Stranger
  • Declaring the default code page as `utf-8` at the beginning of each file is needed when you use Unicode characters directly in that .rb file. Which problem led to your question? 'UTF-8' cp is set in `ruby 1.9.x` by default. Do you have a string with a non-UTF code page? – Малъ Скрылевъ Dec 11 '13 at 14:21
  • 1
    "'UTF-8' cp is set in ruby 1.9.x by default." This is not true. – Roman Kiselenko Dec 11 '13 at 14:34

4 Answers

20

To change the source encoding (i.e. the encoding your source code is actually written in), you currently have to use the magic comment:

# encoding: utf-8

It is not enough to either set the internal encoding (the encoding of the internal string representation after conversion) or the external encoding (the assumed encoding of read files). You actually have to set the magic encoding comment on top of files to set the source encoding.
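To see the three encodings side by side, here is a minimal sketch. The helper name `encoding_report` is made up for illustration; `__ENCODING__`, `Encoding.default_external` and `Encoding.default_internal` are standard Ruby.

```ruby
# Hypothetical helper: collects the three encodings discussed above.
def encoding_report
  {
    :source   => __ENCODING__,               # encoding of this source file
    :external => Encoding.default_external,  # assumed encoding of data read from files and IO
    :internal => Encoding.default_internal   # target of automatic transcoding; nil when unset
  }
end

encoding_report.each do |name, enc|
  puts "#{name}: #{enc.inspect}"
end
```

The values printed depend on the Ruby version, the locale, and whether the file starts with a magic comment, so no fixed output is shown here.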

In ChiliProject we have a rake task which sets the correct encoding header in all files automatically before a release.
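The core of such a task could look roughly like the sketch below. This is not the actual ChiliProject task; the constant `MAGIC_COMMENT` and the method `with_magic_comment` are hypothetical names, and the sketch assumes the magic comment belongs on the first line, or on the second line when the file starts with a shebang.

```ruby
MAGIC_COMMENT = "# encoding: utf-8"

# Hypothetical helper: returns the source text with the magic comment added,
# or the text unchanged if an encoding comment is already in place.
def with_magic_comment(source)
  lines = source.lines.to_a

  # Skip a shebang line, since the magic comment may follow it.
  insert_at = lines.first.to_s.start_with?("#!") ? 1 : 0

  # Make sure a lone shebang line is terminated before appending after it.
  lines[0] += "\n" if insert_at == 1 && !lines[0].end_with?("\n")

  # Leave the file alone if it already declares an encoding there.
  return source if lines[insert_at].to_s =~ /\A#.*coding[:=]\s*[\w.-]+/

  lines.insert(insert_at, MAGIC_COMMENT + "\n")
  lines.join
end

puts with_magic_comment("puts 'hello'\n")
```

A rake task could then read each .rb file, pass its content through this method, and write the result back only when it changed.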

As for encoding defaults:

  • Ruby 1.8 and below didn't know the concept of string encodings at all. Strings were more or less byte arrays.
  • Ruby 1.9: the default encoding of source files, and therefore of string literals, is US-ASCII.
  • Ruby 2.0 and above: the default encoding of source files, and therefore of string literals, is UTF-8.

Thus, if you use Ruby 2.0, you could skip the encoding comment and correctly assume UTF-8 encoding everywhere by default.
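The difference the default makes can be illustrated with a small sketch: the same bytes are invalid when tagged as US-ASCII and valid when tagged as UTF-8. The method names are hypothetical, and the bytes are written as escapes so the example itself stays pure ASCII.

```ruby
# The UTF-8 bytes of "café", written as escapes.
RAW_BYTES = "caf\xC3\xA9"

# Hypothetical helper: the bytes tagged as US-ASCII, as Ruby 1.9 would
# assume for a source file without a magic comment.
def tagged_as_ascii
  RAW_BYTES.dup.force_encoding(Encoding::US_ASCII)
end

# Hypothetical helper: the same bytes tagged as UTF-8.
def tagged_as_utf8
  RAW_BYTES.dup.force_encoding(Encoding::UTF_8)
end

puts tagged_as_ascii.valid_encoding?  # 0xC3 and 0xA9 are outside the 7-bit ASCII range
puts tagged_as_utf8.valid_encoding?
puts tagged_as_utf8.length            # counted in characters, not bytes
```

`force_encoding` only changes the label on the string; the bytes stay the same, which is why both versions have the same `bytesize`.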

Holger Just
12

I think you would want one of the following, depending on the context.

Encoding.default_internal = Encoding::UTF_8
Encoding.default_external = Encoding::UTF_8

This setting is made in the environment.rb file.
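A rough sketch of what these two settings affect, outside of Rails: with both defaults set to UTF-8, text read from a file comes back tagged as UTF-8. The method `read_with_defaults` is a hypothetical name used only for this illustration.

```ruby
require "tempfile"

# Same settings as above, applied in a plain Ruby script.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

# Hypothetical helper: reads a file without naming an encoding,
# so the process-wide defaults apply.
def read_with_defaults(path)
  File.read(path)
end

Tempfile.create("encoding-demo") do |file|
  file.write("caf\u00e9")  # non-ASCII content, written via an escape
  file.flush
  puts read_with_defaults(file.path).encoding
end
```

As noted in the comments below, this does not change how Ruby interprets the source files themselves.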

atw
Sean Larkin
  • 2
    This only defines the internal encoding (the internal string representation after conversion) and the external encoding (the default encoding of read files), but not the encoding of Ruby source files. That can only be changed with magic comments at the top of a source file. – Holger Just Dec 11 '13 at 14:21
  • I had to use this in a dockerized environment, where it defaulted to US-ASCII. Thank you. – Dalibor Filus Nov 19 '18 at 12:15
6

In Ruby 1.9 the default is US-ASCII.

In Ruby 2.0 the default is UTF-8.


Change the Ruby version

or

config.encoding = "utf-8" # application.rb

and in your database.yml:

development:
     adapter:  your_db
     host:     localhost
     encoding: utf8
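To check that every environment in database.yml actually declares such an encoding, something like the following sketch could be used. The method name `environments_without_utf8` is hypothetical, and the sketch assumes a plain YAML file without ERB tags or aliases, with encoding values starting with `utf8`.

```ruby
require "yaml"

# Hypothetical helper: returns the names of the environments whose
# settings do not declare an encoding starting with "utf8".
def environments_without_utf8(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  config.select do |_env, settings|
    !(settings.is_a?(Hash) && settings["encoding"].to_s.start_with?("utf8"))
  end.keys
end

sample = <<YAML
development:
  adapter: your_db
  host: localhost
  encoding: utf8
test:
  adapter: your_db
  host: localhost
YAML

p environments_without_utf8(sample)
```

Some adapters accept other spellings for the same encoding, so a real check would need to be adjusted to the database in use.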
Roman Kiselenko
2

In your application.rb

# Configure the default encoding used in templates for Ruby
config.encoding = "utf-8"

This is not the whole story, as Holger pointed out; check out this question for further explanation.

Community
davegson
  • This only defines the internal encoding (the internal string representation after conversion) and the external encoding (the default encoding of read files), but not the encoding of Ruby source files. That can only be changed with magic comments at the top of a source file. – Holger Just Dec 11 '13 at 14:21
  • That answer is the same as one someone else already gave. – Sapphire_Brick Sep 03 '19 at 16:06
  • How so? I feel my answer is not the best, but it adds value that the other answers do not mention (the link and the discussion there). – davegson Sep 03 '19 at 18:13