class Solution {
    public boolean isIsomorphic(String s, String t) {

        int[] index_s = new int[128]; // create two int arrays of length 128; every slot starts at 0
        int[] index_t = new int[128];

        for (int i = 0; i < s.length(); i++) {
            char sc = s.charAt(i); // take the character at each position
            char tc = t.charAt(i);

            if (index_s[sc] != index_t[tc]) {
                return false;
            } else {
                index_s[sc] = i + 1;
                index_t[tc] = i + 1;
            }
        }
        return true;
    }
}

In the code above, sc and tc are char variables, yet they are used as indexes into the int arrays. What does this code mean?

  • Both arrays `index_s` and `index_t` have length 128, which is the size of the ASCII table. When you call `charAt()`, you get the character at a specific index, for example `a`. Then when you do `index_s[sc]`, the char is converted to its int representation in the ASCII table, so `a` is 97. – Peter Haddad Jun 27 '21 at 22:53

1 Answer


tl;dr

Your code checks whether the characters of one input string can be mapped one-to-one onto the characters of the other. For each character it records the position (plus one) at which that character was last seen. The record is kept in an array whose indexes represent all 128 possible US-ASCII code numbers (0-127). The code uses a trick of sorts: the char type in Java is effectively a 16-bit unsigned integer, so a char can be used directly as an array index.

Your code works for input String objects whose values contain only US-ASCII characters. If the inputs contain any of the other 143,731 characters defined in Unicode, the array access throws an ArrayIndexOutOfBoundsException, because the char value is 128 or greater.

Details

As Peter Haddad commented, your line fragments:

index_s[sc]
index_t[tc]

… use the variables sc and tc, both of type char, as though they were integers. Those integer values serve as zero-based index numbers into the array to access a slot. This works because in Java char is an integral type, and a char is implicitly widened to int when used as an array index. For a discussion of this peculiar language feature, see Can the char type be categorized as an integer?.

The numeric value extracted from each char is strictly a UTF-16 code unit. For the first 128 values it is the same as the Unicode code point number, which in turn is the same as the US-ASCII number, since Unicode is a superset of US-ASCII. The letter a as a char is effectively the decimal integer 97, the char value of b is 98, and so on.
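A minimal sketch of those two facts: a char widens to its numeric value, and that value can be used as an array index. The class name CharAsIndex and the method slotFor are made up for illustration and are not part of your code.

```java
class CharAsIndex {
    // Hypothetical helper: returns the index Java uses when a char appears inside the brackets.
    static int slotFor(char c) {
        return c; // implicit widening from char to int
    }

    public static void main(String[] args) {
        int[] table = new int[128];        // every slot starts at 0
        char sc = 'a';
        table[sc] = 42;                    // writes to the same slot as table[97]
        System.out.println(slotFor('a'));  // 97
        System.out.println(slotFor('b'));  // 98
        System.out.println(table[97]);     // 42
    }
}
```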

Your code then goes on to store an int value in that slot of the array. An array of the int primitive type is automatically initialized to zero in every slot. The assignment =i+1 does not increment the slot; it overwrites the slot with the current position plus one. A zero therefore means "this character has not been seen yet", and any other value is the one-based position at which that character last appeared in the input string. Adding one is what keeps position 0 distinguishable from "not seen".
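To see what ends up in the slots, here is a small sketch that replays only the assignment for a single string. The class LastSeenTrace and the method lastSeen are hypothetical names, and the sketch assumes US-ASCII input just as your code does.

```java
class LastSeenTrace {
    // Hypothetical helper: replays only the statement index[c] = i + 1 for one string.
    static int[] lastSeen(String s) {
        int[] index = new int[128];        // all zeros: nothing seen yet
        for (int i = 0; i < s.length(); i++) {
            index[s.charAt(i)] = i + 1;    // overwrite with position + 1, not an increment
        }
        return index;
    }

    public static void main(String[] args) {
        int[] index = lastSeen("egg");
        System.out.println(index['e']);    // 1: 'e' was last seen at position 0
        System.out.println(index['g']);    // 3: 'g' was last seen at position 2
        System.out.println(index['x']);    // 0: 'x' never appeared
    }
}
```

Note that the slot for g holds 3 even though g occurs twice, which shows that the value is a position rather than a count.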

The US-ASCII numbers are zero-based, running 0-127. Array indexes in Java are also zero-based, so there is no need to adjust. The NUL character, assigned US-ASCII number 0, maps to the array slot at index 0.

Your code:

if (index_s[sc] != index_t[tc]){
            return false ;

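… returns false as soon as the two slots disagree. Each slot holds the position at which its character was last seen, so a disagreement means that sc and tc were not paired with each other the last time either of them appeared, and the one-to-one mapping is broken.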

The last line, return true, is reached only when the loop finishes without finding such a disagreement, meaning every character pairing was consistent.
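A runnable sketch of the same method with a few sample inputs may make the behavior easier to follow. The class name IsomorphicDemo is made up, and the length check at the top is an addition that is not in your code (without it, a shorter t makes charAt throw).

```java
class IsomorphicDemo {
    static boolean isIsomorphic(String s, String t) {
        if (s.length() != t.length()) {
            return false;                  // added guard, not in the original code
        }
        int[] indexS = new int[128];       // last-seen position + 1 for each char of s
        int[] indexT = new int[128];       // last-seen position + 1 for each char of t
        for (int i = 0; i < s.length(); i++) {
            char sc = s.charAt(i);
            char tc = t.charAt(i);
            if (indexS[sc] != indexT[tc]) {
                return false;              // the two chars were last seen at different positions
            }
            indexS[sc] = i + 1;
            indexT[tc] = i + 1;
        }
        return true;                       // no inconsistent pairing was found
    }

    public static void main(String[] args) {
        System.out.println(isIsomorphic("egg", "add"));     // true
        System.out.println(isIsomorphic("foo", "bar"));     // false
        System.out.println(isIsomorphic("paper", "title")); // true
    }
}
```

For "foo" and "bar" the check fails at the last position: the slot for o holds 2 from the previous pass, while the slot for r still holds 0.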

By the way, the char type is best regarded as legacy: a single 16-bit char cannot represent even half of the characters defined in Unicode. Consider learning to work with Unicode code points directly as int values rather than char.
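One possible way to do that, sketched under my own assumptions rather than taken from your code: read each string as an array of code points with String.codePoints() and keep the last-seen positions in maps instead of fixed 128-slot arrays. The class name CodePointIsomorphic is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CodePointIsomorphic {
    static boolean isIsomorphic(String s, String t) {
        int[] sp = s.codePoints().toArray();   // one int per Unicode code point
        int[] tp = t.codePoints().toArray();
        if (sp.length != tp.length) {
            return false;
        }
        Map<Integer, Integer> lastS = new HashMap<>(); // code point -> last-seen position + 1
        Map<Integer, Integer> lastT = new HashMap<>();
        for (int i = 0; i < sp.length; i++) {
            int seenS = lastS.getOrDefault(sp[i], 0);  // 0 means "not seen yet"
            int seenT = lastT.getOrDefault(tp[i], 0);
            if (seenS != seenT) {
                return false;
            }
            lastS.put(sp[i], i + 1);
            lastT.put(tp[i], i + 1);
        }
        return true;
    }

    public static void main(String[] args) {
        // "\uD83D\uDE00" is one code point (U+1F600) stored as two char values.
        System.out.println(isIsomorphic("\uD83D\uDE00\uD83D\uDE00a", "bbc")); // true
        System.out.println(isIsomorphic("\u00E9\u00E9", "ab"));               // false
    }
}
```

The maps trade the speed of a fixed array for the ability to handle any code point without an out-of-bounds index.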

Basil Bourque
  • Actually, it is a code unit not a code point in this case. Either way, it is difficult to understand what this code (as written) really *means*, but it is not going to deal with Unicode correctly. (It will throw an exception if the strings contain non-ASCII characters.) – Stephen C Jun 28 '21 at 00:51