
Very similar questions have been asked but I couldn't find a solution to my problem.

I have a properties file, config.properties, encoded in ISO-8859-1, that contains the following line:

config1 = some value with âccénted characters

I have a class that loads the properties and exposes a method to get a property value:

import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EnvConfig {
    private static final Properties properties = new Properties();

    static {
        initPropertiesFromFile();
    }

    private static void initPropertiesFromFile() {
        // try-with-resources closes the stream even if load() throws;
        // a missing resource (null stream) ends up in the catch block
        try (InputStream stream = EnvConfig.class.getResourceAsStream("/config/config.properties")) {
            properties.load(new InputStreamReader(stream, StandardCharsets.ISO_8859_1));
            // Tried that as well instead of the previous line: properties.load(stream);
        } catch (Exception e) {
            // Do something
        }
    }

    public static String getProperty(String key, String defaultValue) {
        try {
            System.out.println(Charset.defaultCharset()); // Prints UTF-8
            // return new String(properties.getProperty(key).getBytes("ISO-8859-1")); // Returns some value with �cc�nted characters
            // return new String(properties.getProperty(key).getBytes("UTF-8")); // Returns some value with �cc�nted characters
            // return new String(properties.getProperty(key).getBytes("ISO-8859-1"), "UTF-8"); // Returns some value with �cc�nted characters
            return properties.getProperty(key, defaultValue); // Returns some value with �cc�nted characters
        } catch (Exception e) {
            // Do something
            return defaultValue;
        }
    }
}

I have code that does something with the property value (String) and the code needs the correct String with accents: some value with âccénted characters

public void doSomething() {
    ...
    EnvConfig.getProperty("config1", null); // I need the exact same value as configured in the properties file: some value with âccénted characters; currently get some value with �cc�nted characters
    ...
}

The project is in UTF-8 (Java files are encoded in UTF-8) and project properties/settings (pom) are set to UTF-8.

What am I missing, and how can I achieve this? I know there is no such thing as a "String in UTF-8 format", since a String is just a sequence of UTF-16 code units. But how can I simply get the same "workable" output, the String with accents as configured in the ISO-8859-1 encoded properties file, in my UTF-8 encoded code/project?

Martin
  • Your reading code is almost certainly correct. The issue is probably that whatever you use to print your output doesn't support the accented characters or is configured incorrectly. How do you run your code? On what OS? – Joachim Sauer Sep 17 '19 at 15:03
  • I run it within Eclipse, on Windows, using TestNG (code runs in a test). Thanks – Martin Sep 17 '19 at 15:08
  • What happens if you just put the String `"some value with âccénted characters"` in your code (as a String literal) and print that: `System.out.println("some value with âccénted characters");`? – Joachim Sauer Sep 17 '19 at 15:10
  • It prints fine, I see the accented letters correctly: some value with âccénted characters – Martin Sep 17 '19 at 15:13
  • Then something else goes wrong with your setup. How do you verify that the config file is using ISO-8859-1? Note that *technically* .properties files in Java are specified to use ISO-8859-1, but some libraries (like Spring) use UTF-8 to read them instead (thus making them "not real" properties files), so some tools (like IDEs, build systems) might treat them as ISO-8859-1 and others as UTF-8. – Joachim Sauer Sep 17 '19 at 15:16
  • Check with a text editor that the accented characters are encoded as Unicode escape sequences: `config1 = some value with \u00E2cc\u00E9nted characters` – Vitaly Roslov Sep 17 '19 at 15:23
  • @Joachim Sauer I verified the encoding of the properties file using both Notepad++ (in the bottom right corner) and Eclipse (by doing Edit > Set Encoding ... - shows ISO-8859-1 as the current encoding). I do not use a specific library like Spring. As you said, it may be a problem with my setup. I have done similar setups in the past (and working actively on other projects) having similar code/architecture with configs having accented letters with no issue. I keep looking and I'll focus on the setup, not the code. Thanks for your help. – Martin Sep 17 '19 at 15:26
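
Two of the suggestions in the comments can be checked in a few lines: whether the loaded value is actually damaged (rather than just displayed badly), and whether Unicode escapes sidestep the file encoding. The sketch below is only an illustration; the class `EscapedPropertiesDemo` and its helpers `codePoints` and `loadValue` are hypothetical names, not part of the original code. It loads a value written with escape sequences and prints its code points as `U+XXXX`, which does not depend on how the console renders accents. If the same helper shows `U+FFFD` for a value loaded from the real file, the data was damaged before or during loading, not at print time.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.stream.Collectors;

public class EscapedPropertiesDemo {

    // Renders each code point as U+XXXX so the result is independent of the console encoding.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // Parses properties text held in memory and returns the value stored under the given key.
    static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the escape sequences literal in the properties text,
        // so Properties.load() is the one that turns them into accented characters.
        String value = loadValue("config1 = \\u00E2cc\\u00E9nted", "config1");
        System.out.println(codePoints(value));
    }
}
```

Because the escaped form is plain ASCII, it reads the same whether a tool treats the file as ISO-8859-1 or as UTF-8.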

1 Answer


After hours of searching, it turns out that my encoding issue was caused by resource filtering being set to true in the project's POM:

    <resources>
        <resource>
            <directory>src/main/resources</directory>
            <filtering>true</filtering>
        </resource>
    </resources>

Setting this to false fixes the issue. I still need to find a way to make it work with filtering enabled, so I'll try to figure that out. There are some clues in other questions and answers, such as "Wrong encoding after activating resource filtering". Thanks.
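
For anyone wondering how a build step can damage the file, here is a minimal sketch of one plausible mechanism. It assumes that the filtering step re-read the ISO-8859-1 file as UTF-8 (the project encoding); I have not confirmed that this is exactly what happened in my build. The class name `CharsetMismatchDemo` and its methods are made up for illustration. The accented bytes are not valid UTF-8 on their own, so the decoder substitutes the replacement character U+FFFD, and once that happens the original letters cannot be recovered by re-encoding the String afterwards.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encodes the text as ISO-8859-1, then decodes those bytes with the wrong charset (UTF-8).
    static String decodeLatin1BytesAsUtf8(String original) {
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Encodes and decodes with the same charset, which preserves the accented characters.
    static String roundTripLatin1(String original) {
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Escapes in the literal keep this example independent of the source file encoding.
        String original = "\u00E2cc\u00E9nted";

        String wrong = decodeLatin1BytesAsUtf8(original);
        String right = roundTripLatin1(original);

        System.out.println("mismatched decode contains U+FFFD: " + (wrong.indexOf('\uFFFD') >= 0));
        System.out.println("matching decode equals original:   " + right.equals(original));
    }
}
```

This also explains why none of the `getBytes(...)` variations in the question helped: by the time the String reaches `getProperty`, the accented characters have already been replaced.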

Martin