I'm working on a Spring web app that will need to handle multiple non-English languages in the future. My concern is that many characters with diacritical marks, and some special characters commonly used in other languages, have more than one Unicode representation (e.g. a precomposed code point vs. a base letter plus combining mark), so I will need to normalize all user input to ensure that canonically equivalent strings actually evaluate as equal.
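To illustrate the problem, here is a minimal example using plain `java.text.Normalizer` (nothing Spring-specific): the precomposed and decomposed forms of "é" only compare as equal after both are normalized to NFC.

```java
import java.text.Normalizer;

public class NfcEqualityDemo {
    public static void main(String[] args) {
        String composed = "\u00E9";     // é as a single precomposed code point (U+00E9)
        String decomposed = "e\u0301";  // e followed by U+0301 COMBINING ACUTE ACCENT

        // Canonically equivalent, but not equal as raw strings
        System.out.println(composed.equals(decomposed)); // false

        // Equal once both are normalized to NFC
        String a = Normalizer.normalize(composed, Normalizer.Form.NFC);
        String b = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        System.out.println(a.equals(b)); // true
    }
}
```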
I ran some brief experiments, examining String input unmarshalled from a text input on a form already on the site, and found that no normalization appears to be happening at the moment. I had expected Spring might normalize inputs already, but that does not seem to be the case (or it has been disabled somehow).
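For reference, my experiment was roughly the following throwaway handler (the mapping path and parameter name are just placeholders I made up), which reports whether the bound value arrives already in NFC:

```java
import java.text.Normalizer;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class NormalizationCheckController {

    // Hypothetical endpoint: reports whether the submitted value is already in NFC
    @PostMapping("/normalization-check")
    public String check(@RequestParam("name") String name) {
        boolean alreadyNfc = Normalizer.isNormalized(name, Normalizer.Form.NFC);
        return "already NFC: " + alreadyNfc;
    }
}
```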
What I want to know is whether there is some way to have all String user input, from any element on the site, normalized to Unicode Normalization Form C (NFC) before any other operations are performed on it.
The most relevant result I found searching here is this unanswered question from 2013. I'm still learning Spring, and the only approach I have found that might work is to define a custom HttpMessageConverter (similar to the demonstration here) that normalizes Strings as part of the conversion, and then make sure that converter is applied to all incoming JSON. However, I'm not sure whether that would cover every type of input, or whether I would have to define a custom converter in place of every available converter (my site doesn't have any converters enabled beyond the list in 2.2 at that link). I'm really hoping there is something more universal and less messy than doing it that way, assuming it would even work. A sketch of what I had in mind for the JSON side follows below.
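In case it helps clarify the idea, here is a rough sketch of the JSON side. Rather than replacing the converter wholesale, it registers a custom String deserializer on the Jackson ObjectMapper used by the existing MappingJackson2HttpMessageConverter (assuming Spring MVC 5+ with Jackson on the classpath; the class names are just ones I made up):

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;
import com.fasterxml.jackson.databind.module.SimpleModule;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.http.converter.json.MappingJackson2HttpMessageConverter;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

import java.io.IOException;
import java.text.Normalizer;
import java.util.List;

@Configuration
public class NfcJacksonConfig implements WebMvcConfigurer {

    // Normalizes every String that Jackson deserializes from a request body to NFC
    static class NfcStringDeserializer extends JsonDeserializer<String> {
        @Override
        public String deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
            String value = p.getValueAsString();
            return value == null ? null : Normalizer.normalize(value, Normalizer.Form.NFC);
        }
    }

    @Override
    public void extendMessageConverters(List<HttpMessageConverter<?>> converters) {
        SimpleModule module = new SimpleModule();
        module.addDeserializer(String.class, new NfcStringDeserializer());

        // Attach the module to the existing Jackson converter instead of replacing it
        for (HttpMessageConverter<?> converter : converters) {
            if (converter instanceof MappingJackson2HttpMessageConverter) {
                ((MappingJackson2HttpMessageConverter) converter).getObjectMapper().registerModule(module);
            }
        }
    }
}
```

As far as I can tell, though, something like this would only affect @RequestBody JSON, not form fields or query parameters bound via @RequestParam or @ModelAttribute, which is exactly my concern about whether the converter approach covers every type of input.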