1

I am trying to learn F#. And I need som help with a simple soundex expression. I am using the following ruleset for Simplified (also called American) soundex:

1.) Assign characters to classes
2.) Remove duplicate values here, e.g. 222 becomes 2  
3.) Replace first encoded char with first char  
4.) Remove nulls
5.) Truncate ot pad to totally 4 characters

Currently I am stuck on rule no. 2. I was thinking of using a recursive expression. As I am currently a n00b on F# I will try to ask you for an elegant solution to my problem.Maybe my entire approach to translate text to soundex is off target?

Any suggestions will be greatly appreciated :)

Here is my code:

let Simplified (name:string) =
let ca = name.ToLower().ToCharArray()
new string(
    Array.map(
        fun e ->
        match e with                                                          
            | 'a' | 'e' | 'i' | 'o' | 'u' | 'y' | 'w' | 'h' -> '0'
            | 'b' | 'f' | 'p' | 'v'                         -> '1'
            | 'c' | 's' | 'k' | 'g' | 'j' | 'q' | 'x' | 'z' -> '2'
            | 'd' | 't'                                     -> '3'
            | 'l'                                           -> '4'
            | 'm' | 'n'                                     -> '5'
            | 'r'                                           -> '6'
            |  _                                            -> ' '
        )  ca
  //|> fun s -> TODO: Remove duplicates here
    |> fun s -> Array.set s 0 (ca.[0]) 
                Array.choose(fun e -> if e <> '0' then Some(e) else None) s   
)  
|> fun s -> (
            match s.Length with                                               
                | x when x < 3 -> s.PadRight(4, '0')
                | _ -> s.Substring(0, 4)
            ).ToUpper()
Slurken
  • 13
  • 3

4 Answers4

4

If you want to remove consequent duplicates (the second option in the zeuxcg's solution), then you can also implement this directly as a recursive function (using accumulator parameter). This nicely demonstrates pattern matching, so it is good thing to try while learning F#:

let removeConsequentDuplicates list = 
  let rec loop acc list =
    match list with 
    | x1::x2::xs when x1 = x2 -> loop acc (x2::xs)
    | x::xs -> loop (x::acc) xs
    | _ -> acc |> List.rev
  loop [] list

This version works with lists, but since you're working with arrays, you'll probably need an imperative version. You can use sequence expressions like this:

let removeConsequentDuplicates (arr:_[]) = 
  let rec loop last i = seq {
    if i < arr.Length - 1 && last = arr.[i] then 
      yield! loop last (i+1)
    elif i < arr.Length - 1 then
      yield arr.[i]
      yield! loop (arr.[i]) (i + 1) }
  [| if arr.Length > 0 then
       yield arr.[0]
       yield! loop arr.[0] 0 |]

As a side-note, I find your syntax a bit unreadable. I don't think it is a good idea to write ... |> fun s -> ..., because it is just an obscured version of let s = ... in .... I would recommend writing something like (I'm not sure I fully understand your code, but you get the idea...):

let Simplified (name:string) =
  let ca = name.ToLower().ToCharArray()
  let s = 
    ca |> Array.map (function
            | '0' ... )
       |> removeConsequentDuplicates
  Array.set s 0 (ca.[0])
  let s = s |> Array.choose(fun e -> if e <> '0' then Some(e) else None)
  let s = (new String(s)).ToUpper()
  match s.Length with                                               
  | x when x < 3 -> s.PadRight(4, '0')
  | _ -> s.Substring(0, 4)
Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
2

Remove consecutive duplicates using arrays using loops rather than recursion, most simply in a sequence expression like this:

let removeDuplicates (xs: _ []) =
  [|if xs.Length > 0 then yield xs.[0]
    for i=1 to xs.Length-1 do
      if xs.[i] <> xs.[i-1] then
        yield xs.[i]|]
J D
  • 48,105
  • 13
  • 171
  • 274
1

If you want to remove all duplicates from the array (leaving unique elements), the following will do:

arr |> Seq.distinct |> Seq.toArray

If you want to remove consecutive duplicates, then the solution is harder. This is the simplest one I can come up with:

let unique list =
    list
    |> List.fold (fun acc e ->
        match acc with
        | x::xs when x = e -> acc
        | _ -> e::acc) []
    |> List.rev

You can do it with arrays either via Array.toList and Array.ofList or using Array.fold and changing match expression and list construction; the code is less readable so I'm posting the List version.

Alternative solutions involve Seq.pairwise, i.e.:

let unique arr =
    if Array.isEmpty arr then
        arr
    else
        Array.append [|arr.[0]|] (
            arr
            |> Seq.pairwise
            |> Seq.toArray
            |> Array.choose (fun (p, n) -> if p = n then None else Some n))
zeuxcg
  • 9,216
  • 1
  • 26
  • 33
  • Yes, I assumed that the task is to remove consecutive duplicates; otherwise the solution is just Seq.distinct |> Seq.toArray - I'll update the post. – zeuxcg Jan 28 '11 at 13:42
  • don't forget about poor neglected foldBack. – gradbot Jan 28 '11 at 16:03
1

Seq.fold is your friend.

let soundex (text : string) = 
    let choose = 
        function 
        | 'b' | 'f' | 'p' | 'v' -> Some "1" 
        | 'c' | 'g' | 'j' | 'k' | 'q' | 's' | 'x' | 'z' -> Some "2" 
        | 'd' | 't' -> Some "3" 
        | 'l' -> Some"4" 
        | 'm' | 'n'  -> Some "5"
        | 'r' -> Some "6"
        | _ -> None 

    let fold state value = 
        match state with
        | i :: _ when i = value -> state
        | _ -> value :: state

    let t = text.Substring(1).ToLower() |> Seq.choose choose |> Seq.fold fold [] |> Seq.toList |> List.rev |> String.concat ""

    text.Substring(0,1) + t.PadRight(3, '0').Substring(0, 3)

This is based on the wikipedia article for soundex.

Huusom
  • 5,654
  • 1
  • 17
  • 15