How to split a string into an array of each 2 consecutive words - Ruby?

Question

I want to split a string into an array that contains each pair of consecutive words from the original string, as below:

Ex:
str = "how are you to day"
output = ["how are", "are you", "you to", "to day"]

Can anyone give a solution? Thanks so much!

From https://stackoverflow.com/questions/55004542/ruby-get-consecutive-pairs-of-elements-from-an-array: `str.split.each_cons(2).to_a` – melcher, Aug 27 '21 at 03:53

Rajagopalan · Accepted Answer · score 2 · 2021-08-27T04:14:29.840

Input

str = "how are you to day"

Code

p str.split(/\s/)
     .each_cons(2)
     .map { |pair| pair.join(" ") }  # each pair is a two-word array; the name avoids shadowing str

Output

["how are", "are you", "you to", "to day"]
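
A minimal sketch of the same idea wrapped in a method, so the edge cases are easy to see. The method name `word_pairs` is hypothetical and not part of the answer, and it uses plain `split` rather than `split(/\s/)`:

```ruby
# Hypothetical helper; the name word_pairs is an assumption for illustration.
def word_pairs(str)
  str.split                           # split on runs of whitespace
     .each_cons(2)                    # sliding window of two adjacent words
     .map { |pair| pair.join(" ") }   # turn each two-word array back into a string
end

p word_pairs("how are you to day")  #=> ["how are", "are you", "you to", "to day"]
p word_pairs("hello")               #=> []
p word_pairs("")                    #=> []
```

With fewer than two words, `each_cons(2)` yields nothing, so the result is an empty array.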

edited Aug 27 '21 at 04:14

answered Aug 27 '21 at 03:45

Is the `map(&:itself)` basically the same as calling `to_a`? – melcher Aug 27 '21 at 03:59
@melcher No! `map(&:itself) = map{|x| x}` – Rajagopalan Aug 27 '21 at 04:10
@melcher I am realizing now that `map(&:itself)` is not necessary in my above code. – Rajagopalan Aug 27 '21 at 04:13
The split argument isn’t really necessary for the example given. – Aug 27 '21 at 05:36
@Michael B How come? I am splitting and converting into an array first; that is very much necessary! – Rajagopalan Aug 27 '21 at 07:19
`split` returns an array. You can save memory by returning an enumerator if you replace `str.split(/\s/)` (or just `str.split`) with `gsub(/\w+/)`. – Cary Swoveland Aug 27 '21 at 07:21
@Rajagopalan The `split` method is necessary. It's the argument (`(/\s/)`) that is unnecessary. The `split` method will split on whitespace (equivalent to passing `" "`) if no argument is given. Based on the example string, I see no need to run the extra regex. – Aug 27 '21 at 07:35
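
A short sketch of where the two calls differ. For the example string they return the same array, but with leading or repeated whitespace they do not. The input string here is made up for illustration:

```ruby
spaced = " how  are you"   # leading space and a double space, chosen for illustration

default_split = spaced.split        # no argument: splits on runs of whitespace, ignores leading whitespace
regex_split   = spaced.split(/\s/)  # regex: splits at every single whitespace character

p default_split  #=> ["how", "are", "you"]
p regex_split    #=> ["", "how", "", "are", "you"]
```

The empty strings from `split(/\s/)` would end up inside the joined pairs, so plain `split` is also the safer choice for messy input.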
@Rajagopalan and @Cary Swoveland, I just ran some quick benchmarks and, FYI, all other things being equal, `str.split` is the fastest. `str.split(/\s/)` takes about 60% longer to execute, and `gsub(/\w+/)` takes about 140% longer to execute. – Aug 27 '21 at 08:00
@Rajagopalan, you're welcome. Nothing against your suggested approach at all, but for what it's worth, the alternative method I posted below is about 20% faster than even using `str.split` in your suggested approach. – Aug 27 '21 at 08:06
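
The benchmark code behind the timings quoted in these comments was not posted. One way to run a rough comparison yourself is sketched below; the iteration count and labels are assumptions, and results will vary by Ruby version and machine, so no expected numbers are shown:

```ruby
str = "how are you to day"
n = 20_000  # arbitrary iteration count, chosen for illustration

# The three ways of producing the words that the comments compare.
candidates = {
  'split'        => -> { str.split.each_cons(2).map { |pair| pair.join(" ") } },
  'split(/\s/)'  => -> { str.split(/\s/).each_cons(2).map { |pair| pair.join(" ") } },
  'gsub(/\w+/)'  => -> { str.gsub(/\w+/).each_cons(2).map { |pair| pair.join(" ") } }
}

# Time n calls of each candidate with a monotonic clock.
timings = candidates.transform_values do |code|
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { code.call }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

timings.each { |label, seconds| puts format("%-12s %.4fs", label, seconds) }
```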

score 1 · Answer 2 · answered Aug 27 '21 at 03:40

Here is one approach, which uses a regex trick to duplicate the second through second-to-last words in the input string:

input = "how are you to day"
input = input.gsub(/(?<=\s)(\w+)(?=\s)/, "\\1 \\1")
output = input.scan(/\w+ \w+/).flatten
puts output

This prints:

how are
are you
you to
to day
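
To see why the trick works, it may help to look at the intermediate string. This sketch uses the same two regular expressions as above; the variable names `doubled` and `pairs` are introduced here for illustration:

```ruby
input = "how are you to day"

# Every word that has whitespace on both sides is written twice.
doubled = input.gsub(/(?<=\s)(\w+)(?=\s)/, "\\1 \\1")
p doubled  #=> "how are are you you to to day"

# Scanning non-overlapping "word word" chunks now yields the overlapping pairs.
pairs = doubled.scan(/\w+ \w+/)
p pairs    #=> ["how are", "are you", "you to", "to day"]
```

Since the scan pattern has no capture groups, `scan` already returns a flat array of strings, so the result is the same with or without `flatten`.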

Cary Swoveland · Answer 3 · 2021-08-27T07:22:38.493

Here are a couple of ways to do that. Both use the form of String#gsub that takes a regular expression as its argument and no block, returning an enumerator. This form of gsub merely generates matches of the regular expression; it has nothing to do with string replacement.

str = "how are you to day"

Use a regular expression that contains a positive lookahead

r = /\w+(?=( \w+))/
str.gsub(r).with_object([]) { |s,a| a << s + $1 }
  #=> ["how are", "are you", "you to", "to day"]

I've chained the enumerator str.gsub(r) to Enumerator#with_object. String#gsub is a convenient replacement for String#scan when the regular expression contains capture groups. See String#scan for an explanation of how it treats capture groups.
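
A small sketch of the difference being described, using the same `r` and `str` as above. The variable names `from_scan` and `from_gsub` are introduced here for illustration:

```ruby
str = "how are you to day"
r = /\w+(?=( \w+))/

# With a capture group, scan returns only the captured text for each match.
from_scan = str.scan(r)
p from_scan  #=> [[" are"], [" you"], [" to"], [" day"]]

# The block-less gsub enumerator yields the full match instead.
from_gsub = str.gsub(r).to_a
p from_gsub  #=> ["how", "are", "you", "to"]
```

The last word, "day", is never matched because no space and word follow it.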

We can write the regular expression in free-spacing mode to make it self-documenting.

r = /
    \w+       # match >= 1 word characters
    (?=       # begin a positive lookahead
      ( \w+)  # match a space followed by >= 1 word characters and save
              # to capture group 1
    )         # end positive lookahead
    /x        # invoke free-spacing regex definition mode

Enumerate pairs of successive words in the string

enum = str.gsub(/\w+/)
loop.with_object([]) do |_,a|
  a << enum.next + ' ' + enum.peek
end
  #=> ["how are", "are you", "you to", "to day"]

See Enumerator#next and Enumerator#peek. After next returns the last word in the string, peek raises a StopIteration exception, which loop handles by breaking out of the loop and returning the array a. See Kernel#loop.
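
The next/peek behaviour can be seen step by step in the sketch below. It uses a shorter string and a plain `loop { ... }` block rather than the `loop.with_object` form above; the variable names are introduced here for illustration:

```ruby
enum = "to day".gsub(/\w+/)

first    = enum.next   # consumes "to"
upcoming = enum.peek   # looks at "day" without consuming it
last     = enum.next   # consumes "day"

# Nothing is left, so peek raises StopIteration.
exhausted = begin
  enum.peek
  false
rescue StopIteration
  true
end
p [first, upcoming, last, exhausted]  #=> ["to", "day", "day", true]

# Kernel#loop rescues StopIteration, so the loop simply ends at the last word.
pairs = []
words = "how are you to day".gsub(/\w+/)
loop { pairs << "#{words.next} #{words.peek}" }
p pairs  #=> ["how are", "are you", "you to", "to day"]
```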

@Rajagopalan, thanks, but you are misguided! I'm just a Ruby hobbyist. – Cary Swoveland, Aug 27 '21 at 07:25

score 0 · Answer 4 · 2021-08-27T05:35:11.873

Here’s another option:

str = "how are you to day"
arr = str.split
new_arr = []
(arr.length - 1).times { new_arr.push(arr[0..1].join(" ")); arr.shift }  # note: shift empties arr as it goes

print new_arr #=> ["how are", "are you", "you to", "to day"]

edited Aug 27 '21 at 05:35

answered Aug 27 '21 at 04:42

...or drop `new_arr = []` and replace the next line with `(arr.length-1).times.map { |i| arr[i] + ' ' + arr[i+1] }`. – Cary Swoveland Aug 27 '21 at 18:04
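
Written out in full, the variant suggested in this comment might look like the sketch below. It indexes into `arr` instead of calling `shift`, so `arr` is left unchanged:

```ruby
str = "how are you to day"
arr = str.split

# Pair each word with the one after it, by index.
new_arr = (arr.length - 1).times.map { |i| arr[i] + ' ' + arr[i + 1] }

p new_arr  #=> ["how are", "are you", "you to", "to day"]
p arr      #=> ["how", "are", "you", "to", "day"]
```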
