I have a string like café and I need to translate it to cafe.
I tried (string-normalize-nfd "café"),
but it returns "cafe" with the accent on the closing quotation mark, and (string-normalize-nfd "alguém") returns "alguem" with the accent on the m.
How can I translate the accented string to a non-accented string?

You can use iconv – amirouche Jun 14 '19 at 00:14
3 Answers
I can't think of a built-in procedure that does what you need, but it's easy to write your own implementation:
    ; maps accented chars to unaccented chars
    (define translate
      '#hash((#\á . #\a)
             (#\é . #\e)
             (#\í . #\i)
             (#\ó . #\o)
             (#\ú . #\u)))

    (define (remove-accents str)
      (apply string ; convert char list back into string
             ; for each char: replace it with non-accented
             ; version, if not present leave it unmodified
             (map (λ (c) (hash-ref translate c (const c)))
                  (string->list str)))) ; convert string to char list
Be sure to add more mappings as needed, for instance to include uppercase chars, etc. It works as expected:
    (remove-accents "café")
    => "cafe"

Your question is not really one about Racket; it's about Unicode normalization. The function that you're referring to performs the "Canonical Normalization" described on this page.
It appears to me that the best way to do what you want might be to perform the normalization and then strip out any combining accent characters, provided you know that the original string doesn't contain standalone accent characters that you want to keep.
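A minimal sketch of that idea, assuming that "accent characters" means Unicode nonspacing marks (general category Mn). The name strip-combining-marks is hypothetical; string-normalize-nfd and char-general-category are standard Racket procedures.

```racket
#lang racket/base

;; Hypothetical helper: decompose with NFD, then drop every nonspacing
;; mark (general category Mn), which is where combining accents live.
(define (strip-combining-marks s)
  (list->string
   (for/list ([c (in-string (string-normalize-nfd s))]
              #:unless (eq? (char-general-category c) 'mn))
     c)))

(strip-combining-marks "caf\u00E9")   ; "cafe"
(strip-combining-marks "algu\u00E9m") ; "alguem"
```

Characters that have no canonical decomposition, such as ø, pass through unchanged with this approach.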

The Racket [`string-normalize-{nfc nfd nfkc nfkd}`](https://docs.racket-lang.org/reference/strings.html#%28def._%28%28quote._~23~25kernel%29._string-normalize-nfd%29%29) functions do what's described on that page. – Greg Hendershott Aug 04 '18 at 17:57
You have the right idea to use string-normalize-nfd, and it's actually working! It's just that Racket strings are sequences of Unicode characters, and the composed and decomposed forms print the same.

    (string-normalize-nfd "café") ; prints as "café", although it is now decomposed

You can see that it worked if you convert the string to bytes:

    (string->bytes/utf-8 (string-normalize-nfd "café")) ;#"cafe\314\201"
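Another way to see the decomposition, without going through bytes, is to look at the code points. This is only an illustrative sketch: code-points is a hypothetical helper, and the literal uses a \u escape so that the input is certain to be the precomposed form.

```racket
#lang racket/base

;; Hypothetical helper: list the Unicode code points of a string.
(define (code-points s)
  (map char->integer (string->list s)))

(define composed "caf\u00E9") ; é written as the single code point U+00E9

(string-length composed)                        ; 4
(string-length (string-normalize-nfd composed)) ; 5
(code-points (string-normalize-nfd composed))   ; '(99 97 102 101 769)
```

The trailing 769 is U+0301, the combining acute accent, which is the same character that shows up as the bytes \314\201 in UTF-8.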
Given that, here's a rough cut at a function. I'd be surprised if this were exactly correct for all cases. But hopefully it's enough to get you on your way and you can refine it.
    (define (ascii-ize s)
      (list->string
       (for/list ([b (in-bytes (string->bytes/utf-8
                                (string-normalize-nfd s)))]
                  #:when (< b 128))
         (integer->char b))))

    (ascii-ize "café")   ;"cafe"
    (ascii-ize "alguém") ;"alguem"
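One case where the rough cut may not do what you want: because it keeps only bytes below 128, any character without an ASCII base letter in its decomposition is dropped entirely. The sketch below repeats ascii-ize so that it runs on its own; the sample word is just an illustration.

```racket
#lang racket/base

;; Same definition as above, repeated so this snippet is self-contained.
(define (ascii-ize s)
  (list->string
   (for/list ([b (in-bytes (string->bytes/utf-8
                            (string-normalize-nfd s)))]
              #:when (< b 128))
     (integer->char b))))

;; ø (U+00F8) has no canonical decomposition, so both of its UTF-8
;; bytes are >= 128 and the whole letter disappears.
(ascii-ize "sm\u00F8rrebr\u00F8d") ; "smrrebrd"
```

If such characters should map to something like "o", an explicit lookup table is still needed for them.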
