I'm using Rails 3.0.3 with REE (Ruby 1.8.7) and gem 'mysql2', '0.2.6'.

My project has a search feature that works over GET: people can either type the URL directly or submit a form, which then generates the URL.

Example:

I want to search:

origin city: "Århus, Denmark" and destination city: "Asunción, Paraguay"

Both contain a special character ("Å" and "ó"), so when someone clicks the search button, the URL is generated like this:

?&origin=%C5rhus%2C%20Denmark&destination=Asunci%F3n%2C%20Paraguay

Problem:

When I search for those cities, the parameters are not unescaped the way I want (I tried CGI, URI, and even some gems).

In the console, I can see that ActiveRecord received the query like this:

Parameters: {"destination"=>"Asunci�n, Paraguay", "origin"=>"�rhus, Denmark", "sort"=>"newest"}
City Load (0.1ms)  SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = '�rhus') ORDER BY cities.name ASC
City Load (6.8ms)  SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = 'Asunci�n, Paraguay') ORDER BY cities.name ASC

Conclusion: the cities can't be found :(

But I found an interesting thing:

  • When I introduce an error in the file associated with this function, the error page shows the parameters like this:

    Request

    Parameters:
    {"destination"=>"Asunción, Paraguay", "origin"=>"Århus, Denmark", "sort"=>"newest"}

There, the parameters are valid!

Question:

Does anyone have an idea how to solve this? Thanks in advance :)

panggi

1 Answer

You're right, it looks like you have an encoding problem somewhere. The 0xC5 byte is "Å" in ISO-8859-1 (AKA Latin-1); in UTF-8 it would be %C3%85 in the URL.
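To make the byte-level difference concrete, here is a minimal sketch in Ruby (1.9 or later) that percent-encodes the same city name from both encodings. The `percent_encode` helper is hypothetical and written only for this illustration; it is not part of Rails or of the asker's code.

```ruby
# Hypothetical helper for illustration: percent-encode every byte that is
# not an unreserved ASCII character, regardless of the string's encoding.
def percent_encode(str)
  str.bytes.map { |b|
    if b < 128 && b.chr =~ /[A-Za-z0-9_.~-]/
      b.chr
    else
      format('%%%02X', b)
    end
  }.join
end

utf8   = "\u00C5rhus, Denmark"        # "Århus, Denmark" as UTF-8
latin1 = utf8.encode('ISO-8859-1')    # same text, one byte per character

puts percent_encode(utf8)    # %C3%85rhus%2C%20Denmark
puts percent_encode(latin1)  # %C5rhus%2C%20Denmark
```

The second line of output matches the `origin` value in the question's URL, which is consistent with the data having been encoded as Latin-1 before it reached the server.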

I suspect that you're using JavaScript on the client side and that your JavaScript is using the old escape function to build the URL. escape has some issues with non-ASCII characters: it emits characters below U+0100 as a single Latin-1 byte (%XX) instead of as UTF-8. If this is the case, then you should change your JavaScript to use encodeURIComponent instead. Have a look at this little demo and you'll see what I'm talking about:

http://jsfiddle.net/ambiguous/U5A3k/

If you can't change the client-side script, then you can do it the hard way in Ruby using force_encoding and encode (both require Ruby 1.9 or later):

>> s = CGI.unescape('%C5rhus%2C%20Denmark')
=> "\xC5rhus, Denmark"
>> s.encoding
=> #<Encoding:UTF-8>
>> s.force_encoding('iso-8859-1')
=> "\xC5rhus, Denmark"
>> s.encoding
=> #<Encoding:ISO-8859-1>
>> s.encode!('utf-8')
=> "Århus, Denmark"
>> s.encoding
=> #<Encoding:UTF-8>

You should get something like "\xC5rhus, Denmark" from params, and you could unmangle that with:

s = params[:whatever].force_encoding('iso-8859-1').encode('utf-8')

Dealing with this on the server side would be a last resort, though. If your client-side code is sending back incorrectly encoded data, then you'll be left with a pile of guesswork on the server to figure out what encoding was actually used to get it into the URL.
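If the server-side route can't be avoided, the guesswork might look something like the sketch below. It assumes that any parameter which is not valid UTF-8 was sent as ISO-8859-1; that is an assumption for illustration, not something the server can verify. The `to_utf8` name is hypothetical, and the code needs Ruby 1.9 or later (Ruby 1.8.7 has no String#force_encoding).

```ruby
# Hypothetical helper: return the input as UTF-8, assuming Latin-1 when the
# raw bytes are not valid UTF-8. This is a guess about the sender's encoding.
def to_utf8(raw)
  s = raw.dup.force_encoding('UTF-8')   # dup so the caller's string is untouched
  return s if s.valid_encoding?         # already well-formed UTF-8
  s.force_encoding('ISO-8859-1').encode('UTF-8')
end

puts to_utf8("\xC5rhus, Denmark".b)     # Latin-1 bytes, as in the question
puts to_utf8("\u00C5rhus, Denmark")     # already UTF-8, returned unchanged
```

Note that a short Latin-1 byte sequence can occasionally be valid UTF-8 by coincidence, so this check can misfire; fixing the client remains the more reliable option.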

mu is too short
  • Yay! I was using the 'escape' function to build the URL in JavaScript before; now I'm using 'encodeURIComponent' as you suggested, and it works :D Thank you for saving my day :) – panggi Jan 17 '12 at 04:39