1

I want to split a string into an array that contains each pair of consecutive words from the original string, as below:

Ex:
str = "how are you to day"
output = ["how are", "are you", "you to", "to day"]

Can anyone give a solution? Thanks so much!

Dung Nguyen
  • From https://stackoverflow.com/questions/55004542/ruby-get-consecutive-pairs-of-elements-from-an-array: `str.split.each_cons(2).to_a` – melcher Aug 27 '21 at 03:53

4 Answers

2

Input

str = "how are you to day"

Code

p str.split(/\s/)
     .each_cons(2)
     .map { |pair| pair.join(" ") }

Output

["how are", "are you", "you to", "to day"]
Rajagopalan
  • Is the `map(&:itself)` basically the same as calling `to_a`? – melcher Aug 27 '21 at 03:59
  • @melcher No! `map(&:itself) = map{|x| x}` – Rajagopalan Aug 27 '21 at 04:10
  • @melcher I am realizing now that `map(&:itself)` is not necessary in my above code. – Rajagopalan Aug 27 '21 at 04:13
  • The split argument isn’t really necessary for the example given. – Aug 27 '21 at 05:36
  • @Michael B How come? I am splitting and converting into an array first; that is very much necessary! – Rajagopalan Aug 27 '21 at 07:19
  • `split` returns an array. You can save memory by returning an enumerator if you replace `str.split(/\s/)` (or just `str.split`) with `gsub(/\w+/)`. – Cary Swoveland Aug 27 '21 at 07:21
  • @Rajagopalan The `split` method is necessary. It's the argument (`/\s/`) that is unnecessary. The `split` method will split on whitespace (equivalent to `" "`) if no argument is given. Based on the example string, I see no need to run the extra regex. – Aug 27 '21 at 07:35
  • @Rajagopalan and @Cary Swoveland, I ran some quick benchmarks and, just FYI, all other things being equal, using `str.split` is the fastest. `str.split(/\s/)` takes about 60% longer to execute, and using `gsub(/\w+/)` takes about 140% longer to execute. – Aug 27 '21 at 08:00
  • @Rajagopalan, you're welcome. Nothing against your suggested approach at all, but for what it's worth, the alternative method I posted below is about 20% faster than even using `str.split` in your suggested approach. – Aug 27 '21 at 08:06
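
The difference between `split` with no argument and `split(/\s/)` discussed in these comments only becomes visible when the input has leading or repeated whitespace. A minimal sketch, assuming a made-up input (`messy`, `default_split` and `regex_split` are names chosen here, not from the question):

```ruby
# Hypothetical input with a leading space and a double space.
messy = " how  are you"

# No argument: splits on runs of whitespace and ignores leading whitespace.
default_split = messy.split

# Regex /\s/: splits on every single whitespace character,
# so empty strings show up in the result.
regex_split = messy.split(/\s/)

p default_split  # ["how", "are", "you"]
p regex_split    # ["", "how", "", "are", "you"]
```

For the single-spaced string in the question both calls return the same array.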
1

Here is one approach, which uses a regex trick to duplicate every word in the input string except the first and the last:

input = "how are you to day"
input = input.gsub(/(?<=\s)(\w+)(?=\s)/, "\\1 \\1")
output = input.scan(/\w+ \w+/).flatten
puts output

This prints:

how are
are you
you to
to day
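
To make the trick easier to follow, this sketch prints the intermediate string produced by the `gsub` step before scanning it. The names `doubled` and `pairs` are introduced here purely for illustration:

```ruby
input = "how are you to day"

# Duplicate every word that has whitespace on both sides
# (every word except the first and the last).
doubled = input.gsub(/(?<=\s)(\w+)(?=\s)/, "\\1 \\1")
puts doubled  # how are are you you to to day

# Each scan match consumes two words, so the duplicated words
# supply the overlap between neighbouring pairs.
pairs = doubled.scan(/\w+ \w+/)
p pairs       # ["how are", "are you", "you to", "to day"]
```

Because the pattern passed to `scan` has no capture groups, `scan` already returns a flat array of strings here.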
Tim Biegeleisen
1

Here are a couple of ways to do that. Both use the form of String#gsub that takes a regular expression as its only argument and no block, returning an enumerator. This form of gsub merely generates matches of the regular expression; it has nothing to do with string replacement.

str = "how are you to day"

Use a regular expression that contains a positive lookahead

r = /\w+(?=( \w+))/
str.gsub(r).with_object([]) { |s,a| a << s + $1 }
  #=> ["how are", "are you", "you to", "to day"]

I've chained the enumerator str.gsub(r) to Enumerator#with_object. String#gsub is a convenient replacement for String#scan when the regular expression contains capture groups. See String#scan for an explanation of how it treats capture groups.
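
To illustrate that difference, the sketch below runs the same regular expression through both methods. It only demonstrates the capture-group behaviour; `scanned` and `matched` are names chosen here:

```ruby
str = "how are you to day"
r = /\w+(?=( \w+))/

# With a capture group present, scan returns the captured groups
# rather than the whole matches.
scanned = str.scan(r)

# gsub with no block and no replacement enumerates the whole matches.
matched = str.gsub(r).to_a

p scanned  # [[" are"], [" you"], [" to"], [" day"]]
p matched  # ["how", "are", "you", "to"]
```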

We can write the regular expression in free-spacing mode to make it self-documenting.

r = /
    \w+       # match >= 1 word characters
    (?=       # begin a positive lookahead
      ( \w+)  # match a space followed by >= 1 word characters and save
              # to capture group 1
    )         # end positive lookahead
    /x        # invoke free-spacing regex definition mode

Enumerate pairs of successive words in the string

enum = str.gsub(/\w+/)
loop.with_object([]) do |_,a|
  a << enum.next + ' ' + enum.peek
end
  #=> ["how are", "are you", "you to", "to day"]

See Enumerator#next and Enumerator#peek. After next returns the last word in the string, peek raises a StopIteration exception, which loop handles by breaking out of the loop and returning the array a. See Kernel#loop.
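
The same next/peek idea can also be written with an explicit array and a plain `loop` block, which may make the StopIteration exit easier to see. This is only a sketch; `words` and `pairs` are names picked here:

```ruby
words = "how are you to day".gsub(/\w+/)  # enumerator over the words
pairs = []

loop do
  current = words.next                 # advance to the next word
  # peek looks at the following word without advancing; after the
  # last word it raises StopIteration, which loop rescues.
  pairs << "#{current} #{words.peek}"
end

p pairs  # ["how are", "are you", "you to", "to day"]
```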

Cary Swoveland
0

Here’s another option:

str = "how are you to day"
arr = str.split
new_arr = []
(arr.length - 1).times do
  new_arr.push(arr[0..1].join(" "))  # join the first two remaining words
  arr.shift                          # drop the first word (this mutates arr)
end

print new_arr #=> ["how are", "are you", "you to", "to day"]