
How do I separate a string into a list/array of whitespace-separated words?

let x = "this is my sentence";;

And store them in a list/array like this:

 ["this"; "is"; "my"; "sentence"]
thor
Leoking938
Possible duplicate of [Does OCaml have String.split function like Python?](http://stackoverflow.com/questions/23204953/does-ocaml-have-string-split-function-like-python) – Amadan Mar 28 '16 at 01:47

I would like to know the full process for such an operation, since I'm new to OCaml and don't have a background in array, list, and string manipulation in functional languages. The other answer provided still leaves me with doubts. – Leoking938 Mar 28 '16 at 02:28

3 Answers


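Use split_delim from the Str library (it ships with OCaml, but has to be loaded or linked separately) together with Str.regexp, which builds the regular expression.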

#require "str";;
Str.split_delim (Str.regexp " ") "this is my sentence";;
- : string list = ["this"; "is"; "my"; "sentence"]
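
With a single-space pattern, split_delim returns empty strings wherever two spaces are adjacent. If that matters, or if loading Str is not desired, the standard library alone may be enough. The following is a minimal sketch, assuming OCaml 4.04 or later (for String.split_on_char); the helper name words is made up for illustration and is not part of any library.

```ocaml
(* Hypothetical helper: split [s] on spaces, tabs and newlines,
   dropping the empty strings that runs of whitespace produce. *)
let words (s : string) : string list =
  s
  (* Normalise tabs and newlines to spaces so one split is enough. *)
  |> String.map (fun c -> if c = '\t' || c = '\n' then ' ' else c)
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")

let () =
  let x = "this is  my sentence" in
  (* As a list. *)
  List.iter print_endline (words x);
  (* As an array, if indexed access is needed. *)
  let arr = Array.of_list (words x) in
  Printf.printf "%d words, first is %s\n" (Array.length arr) arr.(0)
```

The list form is usually the natural one in OCaml; Array.of_list converts it when constant-time indexing is wanted.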

I highly recommend getting utop. Its completion is really good for quickly exploring libraries (I typed Str, saw it was there, then typed Str. and looked for the appropriate function).

BWStearns

The full process goes like this:

First, install the re library: opam install re

If you are using utop, you can put something like this in a file called code.ml:

#require "re.pcre"

let () =
  Re_pcre.split ~rex:(Re_pcre.regexp " +") "Hello world more"
  |> List.iter print_endline

Then run it with utop code.ml

If you want to compile to native code, the file would instead contain:

let () =
  Re_pcre.split ~rex:(Re_pcre.regexp " +") "Hello world more"
  |> List.iter print_endline

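Notice how the #require is gone. It is a toplevel directive, not part of the OCaml language, so the compiler would reject it; the package is named on the ocamlfind command line instead.

Then, at the command line, run: ocamlfind ocamlopt -package re.pcre code.ml -linkpkg -o Test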

The OCaml website has tons of tutorials and help. I also have a blog post designed to get you up to speed quickly: http://hyegar.com/2015/10/20/so-youre-learning-ocaml/


Posting from a future in which the standard library has sequences (the Seq module), to offer an alternative that doesn't have to build an entire list unless you actually need one.

We can lazily iterate over the string character by character, using an auxiliary function, aux, to decide when to yield a word. Its cur argument accumulates the current word and is reset to the empty string after each word has been yielded.

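module CharSet = Set.Make (Char)

(* split_words seps s lazily yields the words of s, treating every
   character in the set seps as a separator. *)
let split_words seps s =
  (* seq: characters still to read; cur: the word built so far. *)
  let rec aux seq cur () =
    match seq () with
    | Seq.Nil when cur = "" -> Seq.Nil
    | Seq.Nil -> Seq.Cons (cur, Seq.empty)  (* flush the last word *)
    | Seq.Cons (ch, next) ->
      let is_sep = CharSet.mem ch seps in
      if is_sep && cur = "" then
        aux next "" ()  (* skip a run of separators *)
      else if is_sep then
        Seq.Cons (cur, aux next "")  (* yield the word, start a new one *)
      else
        aux next (Printf.sprintf "%s%c" cur ch) ()  (* extend the word *)
  in
  aux (String.to_seq s) ""

# let x = "this is my sentence" in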
  x
  |> split_words @@ CharSet.of_list [' '; '\t'; '\n']
  |> List.of_seq;; 
- : string list = ["this"; "is"; "my"; "sentence"]
# let x = "this is my sentence" in
  x
  |> split_words @@ CharSet.of_list [' '; '\t'; '\n']
  |> Array.of_seq;; 
- : string array = [|"this"; "is"; "my"; "sentence"|]
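
To make the benefit of laziness concrete, here is a rough sketch of a variant that only does as much work as the consumer asks for. It is not the code above: split_words_idx and first_n are hypothetical names introduced for illustration, and the variant tracks indices and uses String.sub rather than growing a string one character at a time. It assumes OCaml 4.07 or later for the Seq module.

```ocaml
(* Hypothetical variant: lazily yield the words of [s], where [is_sep]
   decides which characters separate words. Each word is cut out with
   String.sub only when the consumer forces the next element. *)
let split_words_idx (is_sep : char -> bool) (s : string) : string Seq.t =
  let len = String.length s in
  (* [next i] produces the words found from index [i] onwards. *)
  let rec next i () =
    (* Skip any separators before the next word. *)
    let rec skip i = if i < len && is_sep s.[i] then skip (i + 1) else i in
    let start = skip i in
    if start >= len then Seq.Nil
    else begin
      (* Advance to the first separator (or the end of the string). *)
      let rec stop j =
        if j < len && not (is_sep s.[j]) then stop (j + 1) else j
      in
      let fin = stop start in
      Seq.Cons (String.sub s start (fin - start), next fin)
    end
  in
  next 0

(* Take at most [n] elements, forcing no more of the sequence than needed. *)
let rec first_n n (seq : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match seq () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: first_n (n - 1) rest

let () =
  let is_space c = c = ' ' || c = '\t' || c = '\n' in
  (* Only the first two words are ever extracted from the string. *)
  "this is my sentence"
  |> split_words_idx is_space
  |> first_n 2
  |> List.iter print_endline
```

Passing a predicate instead of a CharSet is just a design choice for the sketch; a set-based test such as fun c -> CharSet.mem c seps would fit the same parameter.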
Chris