
I am writing a palindrome checker in MIPS, and I am trying to make it accent-insensitive so that something like "ahà" is considered a palindrome too. However, it doesn't look as simple as the case-insensitive scenario, where there is a fixed offset between a lowercase and an uppercase letter.

I asked my teacher about it and she said that I could check the entire string and replace any "è" with "e", then check it again to replace any "é" with "e" and so on, but she told me there is a better solution and asked me to think about it. The only thing I have noticed so far is that the accents are in the extended ASCII code, so > 127, but I can't seem to understand what to do. Can someone help me? Even just a hint would be appreciated, thank you in advance.

spiky
    The best approach is probably a lookup table. Fill a 256-byte LUT with the unaccented equivalents, and if the character c you read is > 127, replace it with lut[c]. The most tedious part is filling the table, but you can do it with a loop that writes a default value everywhere (just in case) and then patch the entries that have an equivalent you want to handle. – Alain Merigot May 23 '19 at 17:32
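A minimal sketch of the fill-then-patch idea from that comment, assuming MARS/SPIM syntax and ISO-8859-1 (Latin-1) byte values. The labels `lut` and `fill` are hypothetical names, the default used here is "each byte maps to itself", and only four accented letters are patched for illustration:

```mips
        .data
lut:    .space 256                 # one output byte per possible input byte

        .text
        .globl main
main:
        la    $t0, lut             # $t0 = table base
        li    $t1, 0               # $t1 = index i
        li    $t2, 256             # loop bound
fill:   addu  $t3, $t0, $t1        # $t3 = &lut[i]
        sb    $t1, 0($t3)          # default: lut[i] = i (byte maps to itself)
        addiu $t1, $t1, 1
        bne   $t1, $t2, fill       # repeat until all 256 entries are written

        # Patch only the entries we care about (Latin-1 codes assumed).
        li    $t4, 0x65            # ASCII 'e'
        sb    $t4, 0xE8($t0)       # e with grave accent -> e
        sb    $t4, 0xE9($t0)       # e with acute accent -> e
        li    $t4, 0x61            # ASCII 'a'
        sb    $t4, 0xE0($t0)       # a with grave accent -> a
        sb    $t4, 0xE1($t0)       # a with acute accent -> a

        li    $v0, 10              # syscall 10: exit
        syscall
```

With an identity default, every byte can be looked up unconditionally, so the `c > 127` test becomes optional.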

1 Answer


You're going to have to hardcode this one with a lookup table, as Alain Merigot suggested. How you do this depends on your string encoding (a single-byte extended ASCII code page vs. UTF-8, etc.).

For single-byte extended ASCII (the table below follows the ISO-8859-1 / Latin-1 layout), I whipped this up and it should work:

.data

ascii_strip_accent_table:
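# index: Latin-1 byte value (U+0080..U+00FF) minus 128
.space 0x40      # zero padding for 0x80-0xBF; the table doesn't really start until U+00C0
.ascii "AAAAAA"  # 0xC0-0xC5: six accented As
.byte 0xC6       # AE ligature, left unchanged
.ascii "C"
.ascii "EEEE"
.ascii "IIII"
.ascii "D"
.ascii "N"
.ascii "OOOOO"   # these are capital Os, not zeroes
.byte 0xD7       # multiplication sign, left unchanged
.ascii "O"       # this is a capital O, not a zero
.ascii "UUUU"
.ascii "Y"
.byte 0xDE,0xDF  # thorn and sharp s, left unchanged
.ascii "aaaaaa"  # 0xE0-0xE5: six accented as
.byte 0xE6       # ae ligature, left unchanged
.ascii "c"
.ascii "eeee"
.ascii "iiii"
.ascii "d"
.ascii "n"
.ascii "ooooo"
.byte 0xF7       # division sign, left unchanged
.ascii "o"
.ascii "uuuu"
.ascii "y"
.byte 0xFE       # thorn, left unchanged
.ascii "y"       # 0xFF: y with diaeresis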

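MyString:
.asciiz "Pokémon"         # assumes the source is saved as Latin-1, so é is the single byte 0xE9
.text
.globl main

main:
la $a0,ascii_strip_accent_table
la $a1,MyString
li $t2,128

loop:
lbu $t0,($a1)             # read next byte from string
beqz $t0,done             # NUL terminator: finished
bltu $t0,$t2,continue     # if char < 128, it is plain ASCII: skip
   subu $t0,$t0,$t2       # subtract 128 to get array index
   addu $a2,$a0,$t0       # table base + array index
   lbu $t1,($a2)          # load replacement from table
   beqz $t1,continue      # zero padding (0x80-0xBF): leave byte unchanged
   sb $t1,($a1)           # store replacement in string
continue:
addiu $a1,$a1,1           # advance the string pointer (not the table base)
j loop

done:
li $v0,10                 # syscall 10: exit
syscall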

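EDIT: If, like me, you can't stand unnecessary padding, you can remove the .space 0x40 at the start of the table and load the base with la $a0,ascii_strip_accent_table-64 instead (assuming your assembler accepts that expression). The risk is that any byte from 0x80 to 0xBF then indexes memory before the table, so whether you're willing to take that risk is up to you.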
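To tie this back to the palindrome check in the question, here is a rough sketch that uses a padding-free table without modifying the string, and avoids the out-of-range risk by testing for 0xC0 instead of 128. It assumes MARS/SPIM syntax and Latin-1 bytes. The names `fold64`, `fold`, `str`, `scan`, and `check` are made up for illustration, and case folding is left out:

```mips
        .data
# 64 entries for bytes 0xC0..0xFF only; index = byte - 0xC0
fold64: .ascii "AAAAAA"            # 0xC0-0xC5
        .byte  0xC6                # AE ligature, unchanged
        .ascii "CEEEEIIII"         # 0xC7-0xCF
        .ascii "DNOOOOO"           # 0xD0-0xD6
        .byte  0xD7                # multiplication sign, unchanged
        .ascii "OUUUUY"            # 0xD8-0xDD
        .byte  0xDE, 0xDF          # thorn, sharp s, unchanged
        .ascii "aaaaaa"            # 0xE0-0xE5
        .byte  0xE6                # ae ligature, unchanged
        .ascii "ceeeeiiii"         # 0xE7-0xEF
        .ascii "dnooooo"           # 0xF0-0xF6
        .byte  0xF7                # division sign, unchanged
        .ascii "ouuuuy"            # 0xF8-0xFD
        .byte  0xFE                # thorn, unchanged
        .ascii "y"                 # 0xFF
str:    .byte  0x61, 0x68, 0xE0, 0 # "ah" + a-grave, NUL-terminated

        .text
        .globl main
main:   la    $s0, str             # $s0 = left pointer
        move  $s1, $s0
scan:   lbu   $t0, 0($s1)          # walk $s1 to the terminating NUL
        beqz  $t0, found
        addiu $s1, $s1, 1
        j     scan
found:  addiu $s1, $s1, -1         # $s1 = right pointer (last character)

check:  bgeu  $s0, $s1, yes        # pointers met or crossed: palindrome
        lbu   $a0, 0($s0)
        jal   fold
        move  $t8, $v0             # folded left character
        lbu   $a0, 0($s1)
        jal   fold                 # folded right character in $v0
        bne   $t8, $v0, no
        addiu $s0, $s0, 1
        addiu $s1, $s1, -1
        j     check
yes:    li    $a0, 1
        j     print
no:     li    $a0, 0
print:  li    $v0, 1               # syscall 1: print integer in $a0
        syscall
        li    $v0, 10              # syscall 10: exit
        syscall

# fold: $a0 = input byte, returns $v0 = byte with its accent stripped
fold:   move  $v0, $a0             # default: return the byte unchanged
        sltiu $t9, $a0, 0xC0       # bytes below 0xC0 have no table entry
        bnez  $t9, fdone
        la    $t9, fold64
        addu  $t9, $t9, $a0
        lbu   $v0, -192($t9)       # biased index: fold64[byte - 0xC0]
fdone:  jr    $ra
```

For the sample bytes this should print 1, since the final a-grave folds to a plain "a". Comparing folded bytes on the fly keeps the original string intact, which may or may not matter for your assignment.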

puppydrum64