When I submit the character Ö from a webpage, the backend receives Ã. The webpage is part of a Spring Webflow/JSF 1.2/Facelets application. When I inspect the POST with Firebug I see:

Content-Type: application/x-www-form-urlencoded 
Content-Length: 74 
rapport=krediet_aanvragen&fw1=0&fw2=%C3%96ZTEKIN&fw3=0&fw4=0&zoeken=Zoeken

The character Ö is encoded as %C3%96; using this table I can see that this is the correct hexadecimal representation of the UTF-8 encoding of Ö. However, when it reaches the backend the character has changed into Ã. Using the same table I can see that some code somewhere interprets the C3 and the 96 separately (or as Unicode \u notation). U+00C3 happens to be Ã, and 96 is not a visible character, so that explains it.

I know this is a typical case of an encoding mismatch; I just don't know where to look to fix it.

The webpage contains

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

When debugging I can see that the library responsible for the wrong interpretation is jboss-el 2.0.0.GA, which seems right because the value is passed to the backend in a webflow expression:

<evaluate expression="rapportCriteria.addParameter('fw2', flowScope.fw2)" />

It is put onto the flowScope by:

<evaluate expression="requestParameters.fw2" result="flowScope.fw2"/>

Never mind the convoluted way of getting the form input into the backend; this is code that tries to integrate Webflow with BIRT reports. I have the same symptom in other web applications, though.

Any idea where I have to start looking?

BalusC
Nicolas Mommaerts

1 Answer

I can see that it is the correct hexadecimal representation of the UTF-8/Unicode character Ö. However when it reaches the backend the character is changed into Ã.

So the client-side character encoding used to encode the POST body is correct, but the server-side character encoding used to decode the POST body is not. You need to create a Filter which basically does the following in its doFilter() method

request.setCharacterEncoding("UTF-8");

and map it on the URL pattern of interest. Spring also provides one out of the box, the CharacterEncodingFilter, which does basically the above. All you need to do is add it to web.xml:

<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
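
To illustrate why the decoding charset matters, here is a minimal standalone sketch that decodes the same fw2 value from the POST body twice. It assumes the server falls back to ISO-8859-1 when no request encoding is set, which is the traditional servlet container default; the class and method names (DecodeDemo, decode, hex) are hypothetical and not part of the application.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeDemo {

    // Decodes a percent-encoded form value using the named charset.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Renders each char as a 4-digit hex code unit, so the output
    // does not depend on the console's own encoding.
    public static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "%C3%96ZTEKIN"; // fw2 exactly as it appears in the POST body

        // Assumed fallback charset: each byte becomes its own character,
        // so C3 turns into U+00C3 (Ã) and 96 into the invisible control U+0096.
        System.out.println(hex(decode(raw, "ISO-8859-1")));
        // prints: 00C3 0096 005A 0054 0045 004B 0049 004E

        // Correct charset: the two bytes C3 96 form the single character U+00D6 (Ö).
        System.out.println(hex(decode(raw, "UTF-8")));
        // prints: 00D6 005A 0054 0045 004B 0049 004E
    }
}
```

Calling request.setCharacterEncoding("UTF-8") before any parameter is read is what makes the container take the second path instead of the first.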

The HTML meta header is, by the way, irrelevant to the issue; it's ignored when the page is served over HTTP. It's the HTTP response header which tells the web browser in which charset it should display the response and send the parameters back to the server. This has apparently already been set properly, since the POST body is correctly encoded. The HTML meta header is only used when the user saves the page to local disk and revisits it later from local disk.

BalusC
  • Thanks, just found out about the CharacterEncodingFilter myself, did the job perfectly. Important for people who may encounter the same problem: the CharacterEncodingFilter needs to be the first filter defined. – Nicolas Mommaerts Apr 07 '11 at 11:40
  • You're welcome. It does not necessarily need to be the very first one; it depends on what the other filters do. If some filters in the chain read the request body or forward the request to another servlet (instead of continuing the chain), then this filter definitely needs to be in front of those filters. Also note that the URL pattern can be narrowed to match exactly your HTML pages, so that it doesn't run unnecessarily on images/CSS/JS/etc. `*.jsf` maybe? – BalusC Apr 07 '11 at 11:41