
Important:

I must use plain Windows Notepad only (no IDE, Notepad++, or any other text editor is allowed).

So I have a simple class:

class Test {
    public static void main(String[] args) {
        char c = 'қ';
        System.out.println(c);
    }
}

By default Notepad saves text files using the ANSI encoding, but as you can see I have a non-ANSI character in my code. I can compile and run this code from the command prompt, but the output is ? instead of қ, which is to be expected. When I change the file's encoding to UTF-8, the compiler throws an error. I have read the question "Illegal Character when trying to compile java code", but it offers no solution for my particular problem because, as I wrote above, I am not allowed to use any text editor other than Windows Notepad.

Thank you!

  • I copied the same code. When I ran through visual studio code, same o/p as mentioned. But unable to compile through cmd getting following error: a.java:1: error: illegal character: '\u00bb' ∩╗┐class Test{ ^ a.java:1: error: illegal character: '\u00bf' ∩╗┐class Test{ ^ a.java:3: error: unclosed character literal char c = '╥¢'; ^ a.java:3: error: illegal character: '\u203a' char c = '╥¢'; ^ a.java:3: error: unclosed character literal char c = '╥¢'; ^ 5 errors – Suman Dey Jul 29 '19 at 16:48
  • You can not do it using Windows Notepad. – Sambit Jul 29 '19 at 16:55
  • @SumanDey As I understand it, the reason is a character called the BOM, which Windows puts at the beginning of a file to signal that the file uses a non-ASCII encoding. But how do I fix this? – Mukhamedali Zhadigerov Jul 29 '19 at 16:57
  • How are you telling javac which character encoding your source file uses? – Tom Blodget Jul 29 '19 at 17:00
  • @TomBlodget I am not telling javac anything. If you mean whether I am running "javac -encoding UTF8 Test.java", yes, I tried that, but it didn't work. – Mukhamedali Zhadigerov Jul 29 '19 at 18:23
  • Every human and program must be told which character encoding a text file uses; though some are willing to guess, if that's what you want. – Tom Blodget Jul 29 '19 at 18:37
  • @TomBlodget No, actually I was just wondering why it is not working. Obviously, I will not use Notepad when writing real code. Also, I could not even type this character manually in cmd, so Windows cmd probably does not support it at all. – Mukhamedali Zhadigerov Jul 29 '19 at 18:43

2 Answers


You probably need a Unicode escape, like this:

char c = '\u039A'; 

I don't know the code point of your 'қ', but you can look it up at https://www.ssec.wisc.edu/~tomw/java/unicode.html

This also assumes that the Windows console is able to display the character.
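As a minimal sketch of the escape approach: қ appears to be U+049B (CYRILLIC SMALL LETTER KA WITH DESCENDER), so the literal can be written in pure ASCII. In that case the encoding Notepad uses to save the file should no longer matter to javac. The class name EscapeDemo and the helper ka() are made up for illustration.

```java
public class EscapeDemo {
    // U+049B is CYRILLIC SMALL LETTER KA WITH DESCENDER, written here as an ASCII-only escape.
    static char ka() {
        return '\u049B';
    }

    public static void main(String[] args) {
        char c = ka();
        // Print the numeric value first, so the result can be checked
        // even when the console cannot render the glyph itself.
        System.out.println(Integer.toHexString(c));
        System.out.println(c);
    }
}
```

Whether the second line shows қ or ? still depends on the console's code page and font.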

P.S. The Windows console uses a particular code page. Try changing it in the console, for example:

REM change CHCP to UTF-8
CHCP 65001
CLS

Also keep in mind that the Windows console can use different fonts, and some of them cannot draw certain symbols.
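Changing the code page only affects how the console interprets bytes. The Java side may still encode output with the platform default charset. A hedged sketch of one way to make the program emit UTF-8 explicitly is shown here; the class Utf8Out and its helpers utf8() and encode() are hypothetical names, and this assumes the console was switched with CHCP 65001 beforehand.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Out {
    // Wraps a stream in a PrintStream that encodes characters as UTF-8.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Returns the bytes that the UTF-8 stream emits for the given text.
    static byte[] encode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = utf8(buffer);
        out.print(text);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Writes the character to standard output as UTF-8 bytes,
        // regardless of the platform default charset.
        utf8(System.out).println('\u049B');
    }
}
```

With this, қ is sent as the two bytes D2 9B. Those are the same bytes that show up as '╥¢' in the compiler error quoted in the comments, where a console using a legacy code page displayed them.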

Anatoly

Yes, the problem is that javac is non-compliant in not accepting the BOM with UTF-8.

Use Notepad to save as Unicode (actually UTF-16LE).

Compile with

javac -encoding UTF-16 Test.java
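To see which BOM Notepad actually wrote, the first bytes of the file can be inspected. The following is only a sketch: BomCheck and detectBom() are illustrative names, and only the UTF-8 and UTF-16 marks are checked (a UTF-32LE file would be misreported as UTF-16LE, because its BOM starts with the same two bytes).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BomCheck {
    // Reports which byte order mark, if any, the data starts with.
    static String detectBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return "none";
    }

    public static void main(String[] args) throws IOException {
        // Usage: java BomCheck Test.java
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(detectBom(content));
    }
}
```

A file saved from Notepad as "Unicode" would be expected to report UTF-16LE. That fits the command above, since the UTF-16 decoder can use the BOM to determine the byte order.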
David Veszelovszki
Tom Blodget
  • What is it failing to comply with? Does the Unicode specification require text processors to intuit encoding from an initial BOM? – VGR Jul 29 '19 at 17:39
  • No, it requires them to accept a BOM except when explicitly given an encoding that has a byte ordering and that is given too. (Would have been much simpler if they would have prohibited it when the encoding does not have a byte order. ) – Tom Blodget Jul 29 '19 at 17:42
  • It appears that when given an encoding that has a byte ordering (other than UTF-8), a BOM is forbidden. From https://unicode.org/faq/utf_bom.html#bom9: “…if a text data stream is marked as UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE, a BOM is neither necessary nor *permitted.*” (emphasis theirs) I agree with your command line solution, though. – VGR Jul 29 '19 at 18:23
  • It compiled, but output is still `?` instead of `қ` – Mukhamedali Zhadigerov Jul 29 '19 at 18:34
  • @MukhamedaliZhadigerov Are you sure the `қ` character can be displayed in the command window? – VGR Jul 29 '19 at 18:35
  • @VGR No, I am not. How to check it? – Mukhamedali Zhadigerov Jul 29 '19 at 18:36
  • @MukhamedaliZhadigerov Try to paste the character into the command window directly. – VGR Jul 29 '19 at 18:37
  • @VGR Interesting... I cannot even type it in the cmd. Why is that? Any ideas? – Mukhamedali Zhadigerov Jul 29 '19 at 18:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/197177/discussion-between-mukhamedali-zhadigerov-and-vgr). – Mukhamedali Zhadigerov Jul 29 '19 at 18:47