
I'm trying to figure out how to improve the transliteration of German umlauts to ASCII for identifiers (ids) in Pandoc. Currently there is only a mapping Char -> Maybe Char that converts ä into a, ß into Nothing, and so on, but the most common convention maps ä to ae, ß to ss, and so on. Here is what I have so far:

import Data.Char (isAscii)
import qualified Data.Map as M

asciiMap' :: M.Map Char String
asciiMap' = M.fromList
  [('\196',"Ae")
  ,('\214',"Oe")
  ,('\220',"Ue")
  ,('\223',"ss")
  ,('\228',"ae")
  ,('\246',"oe")
  ,('\252',"ue")
  ]

toAsciiStr :: Char -> String
toAsciiStr c | isAscii c = [c]
             | otherwise = M.findWithDefault "" c asciiMap'

myTranslit :: String -> String
myTranslit [] = []
myTranslit (x:xs) = toAsciiStr x ++ myTranslit xs

My question is about myTranslit.

Is there maybe already a built-in map-like function someMap :: (a -> [a]) -> [a] -> [a]?

Wolf

2 Answers


Yes, what you are looking for is concatMap :: Foldable t => (a -> [b]) -> t a -> [b], which maps a function over a container and concatenates the resulting lists. Since [] is an instance of Foldable, the type specializes to concatMap :: (a -> [b]) -> [a] -> [b], and further (with a ~ Char and b ~ Char) to concatMap :: (Char -> [Char]) -> [Char] -> [Char]. Note that String is a type alias, type String = [Char], so a String is nothing more than a list of Chars.

You can thus use:

myTranslit :: String -> String
myTranslit = concatMap toAsciiStr
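
For reference, here is a minimal self-contained sketch that combines the definitions from the question with the concatMap version. The second function, myTranslit', is a name introduced here only for illustration; it spells out what concatMap does on lists (map first, then flatten).

```haskell
import Data.Char (isAscii)
import qualified Data.Map as M

-- Mapping taken from the question: umlauts and ß to ASCII digraphs.
asciiMap' :: M.Map Char String
asciiMap' = M.fromList
  [ ('\196', "Ae"), ('\214', "Oe"), ('\220', "Ue"), ('\223', "ss")
  , ('\228', "ae"), ('\246', "oe"), ('\252', "ue")
  ]

-- ASCII characters pass through; unmapped non-ASCII characters become "".
toAsciiStr :: Char -> String
toAsciiStr c
  | isAscii c = [c]
  | otherwise = M.findWithDefault "" c asciiMap'

-- Map every character to a string and concatenate the results.
myTranslit :: String -> String
myTranslit = concatMap toAsciiStr

-- Illustrative name (not from the question): the same thing written
-- as an explicit map followed by concat.
myTranslit' :: String -> String
myTranslit' = concat . map toAsciiStr
```

With these definitions, myTranslit "Gr\246\223e" (that is, "Größe" written with numeric escapes) evaluates to "Groesse".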
Willem Van Onsem
  • Great. I tried along with your edit and now it works :) – Wolf Jun 25 '17 at 12:26
  • I was aware that `String` is `[Char]` and was actually expecting a function of the type `:: (a -> [a]) -> [a] -> [a]` maybe I should edit the question – Wolf Jun 25 '17 at 12:49

You can make the entire thing as concise as

import Data.Char (isAscii)

myTranslit :: String -> String
myTranslit = concatMap $ \c -> case c of
  'Ä' -> "Ae"
  'Ö' -> "Oe"
  'Ü' -> "Ue"
  'ä' -> "ae"
  'ö' -> "oe"
  'ü' -> "ue"
  'ß' -> "ss"
  _ | isAscii c  -> [c]
    | otherwise  -> ""
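
As a hedged sketch of how this behaves on mixed input, the variant below writes the same case analysis with numeric character escapes, so it does not depend on the source file's encoding. The helper name translitChar is hypothetical and not part of the answer.

```haskell
import Data.Char (isAscii)

-- Hypothetical helper name; the logic mirrors the case expression above.
translitChar :: Char -> String
translitChar c = case c of
  '\196' -> "Ae"  -- Ä
  '\214' -> "Oe"  -- Ö
  '\220' -> "Ue"  -- Ü
  '\228' -> "ae"  -- ä
  '\246' -> "oe"  -- ö
  '\252' -> "ue"  -- ü
  '\223' -> "ss"  -- ß
  _ | isAscii c -> [c]  -- plain ASCII passes through unchanged
    | otherwise -> ""   -- any other non-ASCII character is dropped

myTranslit :: String -> String
myTranslit = concatMap translitChar
```

Note that unmapped non-ASCII characters disappear silently: myTranslit "\220bergr\246\223e caf\233" ("Übergröße café") evaluates to "Uebergroesse caf".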
leftaroundabout
  • ...interesting option especially in my case, the integration into Pandoc has to be changed anyway. – Wolf Jun 26 '17 at 06:49