I'm working with a legacy Oracle database whose database encoding is set to "UTF-8", but individual tables can hold data in other encodings, for example "Windows-1252" or "Latin1", to stay compatible with legacy software. We are migrating those systems to Rails, but in the meantime they have to coexist. In my specific case, the tables I'm working with are in "Windows-1252".

This is how my database.yml is set up:

default: &default
  adapter: oracle_enhanced
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  username: username
  password: password

development:
  <<: *default
  database: development_database
  encoding: utf8

The problem is that whenever I get an ActiveRecord relation and want to display it, I have to loop through it manually and convert each value from "Windows-1252" to "UTF-8" using Ruby's encode method.
For example, suppose a table User with only two attributes, id and username. The related view code would look like this:

<tbody>
  <% @users.each do |user| %>
    <tr>
      <%# id is numeric and has no encode method, so only the string column is transcoded %>
      <td><%= user.id %></td>
      <td><%= user.username.encode('UTF-8', 'Windows-1252', invalid: :replace) %></td>
    </tr>
  <% end %>
</tbody>
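
The per-cell conversion in the view above can be pulled into one small helper. This is only a minimal sketch in plain Ruby: the method name `to_utf8` and the sample bytes are hypothetical, and it assumes the driver hands back the raw Windows-1252 bytes unchanged.

```ruby
# Hypothetical helper (not part of Rails or oracle_enhanced):
# relabel the raw bytes as Windows-1252, then transcode them to UTF-8.
def to_utf8(value, source_encoding = 'Windows-1252')
  # Numbers, nil, dates etc. pass through untouched.
  return value unless value.is_a?(String)

  value.dup
       .force_encoding(source_encoding)
       .encode('UTF-8', invalid: :replace, undef: :replace)
end

# Sample data: the bytes of "café" as a Windows-1252 column would store them
# (0xE9 is "é" in Windows-1252).
raw = "caf\xE9".b

puts to_utf8(raw)
puts to_utf8(raw).encoding
puts to_utf8(42)
```

In a view this would reduce each cell to something like `<%= to_utf8(user.username) %>`, which still has to be called per attribute but keeps the encoding names in one place.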

This is not the only problem. When I generate JSON from the ActiveRecord relation I get the error JSON::GeneratorError: source sequence is illegal/malformed utf-8, and there are many other errors depending on what I try to do.
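
The JSON error can be reproduced outside Rails. The sketch below assumes the failing value is a string labelled UTF-8 that actually carries Windows-1252 bytes; the sample string is made up for illustration.

```ruby
require 'json'

# A string tagged as UTF-8 whose bytes are really Windows-1252
# (0xE9 alone is not a valid UTF-8 sequence).
bad = "caf\xE9"
puts bad.valid_encoding?

# Generating JSON from it raises JSON::GeneratorError.
error = begin
  JSON.generate([bad])
  nil
rescue JSON::GeneratorError => e
  e
end
puts error.class

# After relabelling the bytes and transcoding, generation succeeds.
fixed = bad.dup.force_encoding('Windows-1252').encode('UTF-8')
json = JSON.generate([fixed])
puts json
```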

The question is: is there a way to set the encoding of a table right in the model (user.rb in the example above) and have Rails automatically convert from the original encoding, whichever it is, to another such as "UTF-8"? Something like the encoding setting in database.yml, only configured per table instead of for the whole database.

This would certainly clean up my code and help a lot in the development process.

1 Answer

Make a copy of your database

Edit your migration where you create the table in question to read:

create_table :users, options: "ENGINE=InnoDB DEFAULT CHARSET=utf8" do |t|

Reimport your database

EDIT:

If your database wasn't created through Rails migrations (as you implied), you can set the encoding explicitly in Ruby files:

How does the magic comment ( # Encoding: utf-8 ) in ruby work?

More info here:

http://graysoftinc.com/character-encodings/ruby-19s-three-default-encodings

You'll need to do this on every page, but that's better than having to do it on every object on every page. I believe there might be a system-wide setting for it if you can find it in the docs.
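
As a rough illustration of the encodings discussed in the linked article: the magic comment sets the encoding of string literals in a source file, while the external and internal encodings govern how IO is read and transcoded. The sketch below uses a temporary file with made-up contents as a stand-in for legacy data. Whether the oracle_enhanced driver honours any of these settings is not confirmed here, so strings it returns may still need an explicit encode.

```ruby
require 'tempfile'

# String literals take the source file's encoding (what the magic comment sets).
puts "abc".encoding

# Process-wide defaults applied to IO when nothing else is specified.
puts Encoding.default_external
puts Encoding.default_internal.inspect

# For IO, an "external:internal" pair makes Ruby transcode while reading.
text = nil
Tempfile.create('legacy') do |f|
  f.binmode
  f.write("caf\xE9".b) # Windows-1252 bytes for "café"
  f.flush
  text = File.read(f.path, encoding: 'Windows-1252:UTF-8')
end

puts text
puts text.encoding
```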

Mark
  • It's a legacy database that was not generated by Rails migrations, and it's a corporate database with over 36 million records, so copying it is not viable. There is also the problem that other systems use it and are coded to work with the data stored in "Windows-1252", so changing the database encoding is not viable either. – Derick Nunes Oct 12 '18 at 02:54
  • My bad - should have read the question more carefully. You can specify which encoding each Ruby file uses individually with magic comments - link in my answer – Mark Oct 12 '18 at 10:00