1

What is this character, and how can I remove it from a String? I get it from a BufferedReader: I read the contents into a char array, and that array has to be allocated with a fixed size. As a result I get a String like "aaaaaaa����". I tried trim and substring, but neither changed anything:

 String a = "aaaaaaa����";
//subString
    int i = a.lastIndexOf("a");
    a = a.substring(0, i+1);
//trim
    a = a.trim();

And this is how I read the input:

BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
char[] a = new char[1000];
int line;
String responseLine, server_response = "";
while((line = in.read(a)) != -1) {
      responseLine = String.valueOf(a);
      server_response = server_response + responseLine;
     }
in.close();
return server_response;
MRefaat
  • Probably there are encoding problems? – donfuxx Mar 09 '14 at 13:50
  • @donfuxx I suspect that too, but I don't know how to handle it – MRefaat Mar 09 '14 at 13:51
  • @Wooble I'm pretty sure this is not related to the data, as I already know what the data is – MRefaat Mar 09 '14 at 13:54
  • You open a `Reader` without specifying the encoding; as such, the default JRE encoding will be used. Is that what you want? – fge Mar 09 '14 at 14:21
  • No, the problem does not come from reading the input string; it comes from converting the partially filled char array to a String – MRefaat Mar 09 '14 at 14:23
  • Sorry to contradict you, but that may very well be the problem. Don't forget that a `Reader` takes the bytes of the stream and converts them to chars _depending on the encoding_. You never send `char`s down the wire, only bytes. The fact that you have bizarre characters showing up in your resulting strings is a sign that you don't use the correct encoding. Ultimately, a `String` is an array of `char`s, not bytes – fge Mar 09 '14 at 14:25
  • Notably, a Java `char` is a very different thing from a C `char`. – chrylis -cautiouslyoptimistic- Mar 09 '14 at 14:36

5 Answers

2

This is very likely an encoding problem: you do not specify the encoding on your InputStreamReader, so the system default is used.

Try and use:

new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8)

instead.

If you are still stuck with JDK 6, replace StandardCharsets.UTF_8 with Charset.forName("UTF-8").

If you are unsure what encoding is used at the other end, you should not use a Reader but read the contents into a byte array. Then you can use a CharsetDecoder to try and map the bytes read into one or more encodings.

Example:

StandardCharsets.US_ASCII.newDecoder()
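
As a rough sketch of the decoder approach described above, the bytes can be decoded strictly so that a wrong charset guess is reported instead of silently producing replacement characters. The class name `DecodeExample` and the helper `tryDecode` are made up for illustration, and the sample bytes are an assumption, not data from the question.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DecodeExample {
    // Hypothetical helper: returns the decoded text, or null when the
    // bytes are not valid in the given charset.
    static String tryDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    // Fail on bad input instead of substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Sample payload (assumed): "héllo" encoded as UTF-8.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // Not valid ASCII, so this decode attempt is rejected.
        System.out.println(tryDecode(data, StandardCharsets.US_ASCII) == null);
        // Valid UTF-8, so this attempt yields the original text.
        System.out.println("h\u00e9llo".equals(tryDecode(data, StandardCharsets.UTF_8)));
    }
}
```

Trying candidate charsets in turn and keeping the first one that decodes cleanly is only a heuristic; knowing the encoding used by the other end is more reliable.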
fge
  • What didn't solve the problem? Have you tried the `CharsetDecoder` way? Do you at least know the encoding used by the other end? – fge Mar 09 '14 at 14:53
1

Try the Unicode escape.

The Unicode code point for � is U+FFFD, written \ufffd in Java source.

String str0 = "aaaaaaa����";
System.out.println(str0.replaceAll("\ufffd", ""));
Rakesh KR
1

I finally found a way to solve this. It's not a professional solution, but it is good enough. All I had to do was fill the char array with spaces just before the while loop starts (and again after each read), and then, after receiving the whole response, trim it before returning it:

BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
char[] a = new char[1000];
int line;
String responseLine, server_response = "";
for(int i = 0; i < a.length; i++){ //
      a[i] = ' ';                  // this is the for loop i added
    }                              //
while((line = in.read(a)) != -1) {
      responseLine = String.valueOf(a);
      server_response = server_response + responseLine;
      for(int i = 0; i < a.length; i++){ //
          a[i] = ' ';                    // this is the for loop i added
        }                                //
     }
in.close();
return server_response.trim();     // this is where i return the response trimmed 
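
An alternative worth considering, if the stray characters are the unfilled slots of the char array as the asker's comments suggest: `read(char[])` returns the number of chars it actually stored, which may be smaller than the array length, so only that many chars need to be appended. The sketch below assumes this diagnosis; the class `ReadAllExample` and helper `readAll` are illustrative names, and a `StringReader` stands in for the socket reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadAllExample {
    // Hypothetical helper: drains a Reader into a String.
    static String readAll(Reader in) {
        char[] buf = new char[1000];
        StringBuilder sb = new StringBuilder();
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                // Append only the n chars filled by this call,
                // not the whole 1000-char buffer.
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // StringReader is a stand-in for the socket-backed reader.
        String result = readAll(new StringReader("aaaaaaa"));
        System.out.println(result.length()); // prints 7
    }
}
```

With this shape no padding or trimming is needed. It also avoids a pitfall of the space-filling workaround: if a read in the middle of the response returns fewer chars than the buffer holds, the padding spaces end up inside the response, where `trim()` cannot remove them.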
MRefaat
0

You could handle it like this:

System.out.println("aaaaaaa����".replace("�", ""));

The remaining string will be aaaaaaa.

I recommend investigating the input source, though, to figure out why you get those chars in the first place. There is probably an encoding issue somewhere.

donfuxx
  • I tried that, but it refused to save and tells me `some characters cannot be mapped using "Cp1252"` – MRefaat Mar 09 '14 at 14:07
  • Seems like this encoding is not supported. Try using UTF-8 encoding in your reader, as suggested here: http://stackoverflow.com/a/4964652/2399024 – donfuxx Mar 09 '14 at 14:20
  • The problem does not come from reading the input string; it comes from converting the partially filled char array to a String – MRefaat Mar 09 '14 at 14:24
0

If you're only expecting letters and digits, you can loop over the char array, call Character.isLetterOrDigit on each character, and drop those for which it returns false.
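
A minimal sketch of that filtering loop, assuming the input is already a String; the class `FilterExample` and method `keepLettersAndDigits` are made-up names for illustration.

```java
class FilterExample {
    // Hypothetical helper: keeps only the chars that
    // Character.isLetterOrDigit accepts and drops everything else.
    static String keepLettersAndDigits(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+FFFD (replacement character) and U+0000 are neither letters nor digits.
        System.out.println(keepLettersAndDigits("aaaaaaa\ufffd\ufffd\u0000")); // prints aaaaaaa
    }
}
```

Note that this filter also removes spaces and punctuation, so it only fits input that is purely alphanumeric.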

user3293629
  • there are some special chars – MRefaat Mar 09 '14 at 14:05
  • This will return false if the character isn't alphanumeric, so you can use it to find the special chars and set them to "" – user3293629 Mar 09 '14 at 14:10
  • There are some special chars that I need to keep, not remove – MRefaat Mar 09 '14 at 14:11
  • If that's the only character you need removed you can try to compare each char to that character and replace – user3293629 Mar 09 '14 at 14:23
  • I can't write that character into my code in order to compare it, it gives me `some characters cannot be mapped using "Cp1252"` – MRefaat Mar 09 '14 at 14:42
  • Do you know specifically what special character you want? – user3293629 Mar 09 '14 at 15:23
  • @MRefaat _it gives me some characters cannot be mapped using "Cp1252"_: are you using Eclipse IDE? Try setting your workspace text file encoding to UTF-8. You can find it in the Preferences Menu: Window -> Preferences -> General -> Workspace, then select "Other: UTF-8" as "Text file encoding" – Modus Tollens Mar 10 '14 at 14:39
  • @KatjaChristiansen that did enable me to use the char, but it still wasn't replaced in the output when I ran some replace operations – MRefaat Mar 11 '14 at 07:32