
I have a file containing both Cyrillic and non-Cyrillic characters. When I read the file, the non-Cyrillic characters are retrieved but the Cyrillic characters are not. Here is the code I am using:

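import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

// Wrapper class added so the snippet compiles; the name is illustrative.
public class ReadStopwords {

    private static String dirToPRocess = "D:\\stopwords_freq_v2.txt";

    public static void main(String[] args) {
        // Decode the file as UTF-8. try-with-resources closes the reader
        // even if opening or reading fails, so br.close() can never hit null.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new FileInputStream(dirToPRocess), "UTF-8"))) {
            String line = br.readLine();
            while (line != null) {
                System.out.println(line);
                line = br.readLine();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}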
vikifor

1 Answer


Are you using Eclipse?

You can try the following to get it to work:

Save your Java file with the character encoding UTF-8.

If you want to print Cyrillic to the console, I think there is a setting for that somewhere in Eclipse's properties, but I'm not 100% certain. In my experience it prints Cyrillic by default.

Your Java code looks OK, by the way.
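
One way to tell whether the problem is in reading the file or in displaying it is to print the code points of each line instead of the characters themselves. If a Cyrillic letter shows up as something like U+0438, the file was read correctly and only the console output is wrong. Below is a minimal sketch, assuming Java 7 or later. The class and method names (`ConsoleEncodingCheck`, `toCodePoints`, `utf8Stream`) are made up for illustration, and the sample string uses `\u` escapes so it does not depend on how the .java file itself is saved.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncodingCheck {

    // Render each code point as U+XXXX, so the result is plain ASCII
    // and does not depend on the console's encoding.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Wrap a stream in a PrintStream that always encodes text as UTF-8,
    // regardless of the platform default encoding.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to exist on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String line = "\u0438 and"; // Cyrillic small letter i, then ASCII text
        System.out.println(toCodePoints(line));
        utf8Stream(System.out).println(line);
    }
}
```

If the first line of output shows the expected code points but the second line is garbled, the console is probably not interpreting the bytes as UTF-8. In Eclipse, as far as I know, that is controlled per launch under Run Configurations > Common > Encoding.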

Tucker
  • Yes I am using Eclipse – vikifor Sep 13 '13 at 22:33
  • @vikifor did you try going into the file properties and setting the character encoding to UTF-8? You can right-click the file to get to Properties, and then it should be easy to find – Tucker Sep 13 '13 at 22:43
  • I opened Properties and the option "Default (inherited from container: UTF-8)" was already selected (I didn't select it myself). I also went to Window > Preferences > General > Workspace and set "Text file encoding" to "Other: UTF-8", but I still don't get the text that is written in Cyrillic. Maybe I should recreate the file after the changes in Window > Preferences... – vikifor Sep 13 '13 at 22:53
  • @vikifor hmm here's a related post: http://stackoverflow.com/questions/2260325/why-is-java-bufferedreader-not-reading-arabic-and-chinese-characters-correctly – Tucker Sep 13 '13 at 23:01
  • the post didn't help me – vikifor Sep 13 '13 at 23:31
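
Since the Eclipse settings already say UTF-8, another thing worth ruling out is that the text file itself is not actually stored as UTF-8 (for example, it might have been saved in a legacy Cyrillic code page such as windows-1251; that is only a guess). `InputStreamReader` silently replaces malformed input, so the code in the question would not report this. Below is a rough sketch of a strict check, assuming Java 7 or later. `EncodingProbe` and `decodesCleanly` are hypothetical names, and the file path is taken from the command line.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodingProbe {

    // Returns true only if every byte sequence is well-formed in the given
    // charset; REPORT makes the decoder throw instead of substituting U+FFFD.
    static boolean decodesCleanly(byte[] bytes, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EncodingProbe <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("valid UTF-8: "
                + decodesCleanly(bytes, StandardCharsets.UTF_8));
    }
}
```

If this prints `valid UTF-8: false`, the reader in the question is being given the wrong charset name, and re-saving the file as UTF-8 (or passing the file's real encoding to `InputStreamReader`) would be the next thing to try.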