There are two ways you can approach this problem:
- break the string into words, possibly modify each word and join the words back together; or
- use a regular expression.
I will say something about the latter, but I believe your exercise concerns the former--which is the approach you've taken--so I will concentrate on that.
Split string into words
You use String#split(' ') to split the string into words:
str = "a title of a\t book"
a = str.split(' ')
#=> ["a", "title", "of", "a", "book"]
That's fine, even when there's extra whitespace, but one normally writes that as:
str.split
#=> ["a", "title", "of", "a", "book"]
For this string, both ways give the same result as
str.split(/\s+/)
#=> ["a", "title", "of", "a", "book"]
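The three forms only agree when the string has no leading whitespace. A minimal sketch of the difference, using a hypothetical padded string that is not from the question:

```ruby
# Hypothetical input with two leading spaces (not from the question).
padded = "  a title of a\t book"

no_arg    = padded.split         # splits on whitespace, ignores leading whitespace
space_arg = padded.split(' ')    # a single-space argument behaves the same way
regex_arg = padded.split(/\s+/)  # a regex keeps the empty field before the first match

p no_arg     #=> ["a", "title", "of", "a", "book"]
p space_arg  #=> ["a", "title", "of", "a", "book"]
p regex_arg  #=> ["", "a", "title", "of", "a", "book"]
```

For title-casing, the argument-free form is therefore the safer choice.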
Notice that I've used the variable a to signify that an array is returned. Some may feel that is not sufficiently descriptive, but I believe it's better than s, which is a little confusing. :-)
Create enumerators
Next you send the method Enumerable#each_with_index to create an enumerator:
enum0 = a.each_with_index
# => #<Enumerator: ["a", "title", "of", "a", "book"]:each_with_index>
To see the contents of the enumerator, convert enum0 to an array:
enum0.to_a
#=> [["a", 0], ["title", 1], ["of", 2], ["a", 3], ["book", 4]]
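Another way to look inside an enumerator is to pull its elements out one at a time with Enumerator#next. This is only an illustrative sketch; the variable names first and second are mine:

```ruby
a = %w[a title of a book]
enum0 = a.each_with_index

# Each call to next returns the next [word, index] pair.
first  = enum0.next
second = enum0.next
p first   #=> ["a", 0]
p second  #=> ["title", 1]

# rewind moves the enumerator back to the start.
enum0.rewind
p enum0.next  #=> ["a", 0]
```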
You've used each_with_index because the first word--the one with index 0--is to be treated differently than the others. That's fine.
So far, so good, but at this point you need to use Enumerable#map to convert each element of enum0 to an appropriate value. For example, the first value, ["a", 0], is to be converted to "A", the next to "Title" and the third to "of".
Therefore, you need to send the method Enumerable#map to enum0:
enum1 = enum0.map
#=> #<Enumerator: #<Enumerator: ["a", "title", "of", "a", "book"]:each_with_index>:map>
enum1.to_a
#=> [["a", 0], ["title", 1], ["of", 2], ["a", 3], ["book", 4]]
As you see, this creates a new enumerator, which you could think of as a "compound" enumerator.
The elements of enum1 will be passed into the block by Array#each.
Invoke the enumerator and join
You want to capitalize the first word and all other words except those that are articles. We therefore must define some articles:
articles = %w{a of it} # and more
#=> ["a", "of", "it"]
b = enum1.each do |w,i|
  case i
  when 0 then w.capitalize
  else articles.include?(w) ? w.downcase : w.capitalize
  end
end
#=> ["A", "Title", "of", "a", "Book"]
and lastly we join the array with one space between each word:
b.join(' ')
#=> "A Title of a Book"
Review details of calculation
Let's go back to the calculation of b. The first element of enum1 is passed into the block and assigned to the block variables:
w, i = ["a", 0] #=> ["a", 0]
w #=> "a"
i #=> 0
so we execute:
case 0
when 0 then "a".capitalize
else articles.include?("a") ? "a".downcase : "a".capitalize
end
which returns "a".capitalize => "A". Similarly, when the next element of enum1 is passed to the block:
w, i = ["title", 1] #=> ["title", 1]
w #=> "title"
i #=> 1
case 1
when 0 then "title".capitalize
else articles.include?("title") ? "title".downcase : "title".capitalize
end
which returns "Title" since articles.include?("title") => false. Next:
w, i = ["of", 2] #=> ["of", 2]
w #=> "of"
i #=> 2
case 2
when 0 then "of".capitalize
else articles.include?("of") ? "of".downcase : "of".capitalize
end
which returns "of" since articles.include?("of") => true.
Chaining operations
Putting this together, we have:
str.split.each_with_index.map do |w,i|
  case i
  when 0 then w.capitalize
  else articles.include?(w) ? w.downcase : w.capitalize
  end
end
#=> ["A", "Title", "of", "a", "Book"]
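The chained steps can also be wrapped in a method so the join is included. The sketch below is one possible arrangement, not the only one: the method name titleize and the default article list are my own assumptions, and it uses map.with_index, which yields the same [word, index] pairs as each_with_index.map.

```ruby
# Hypothetical helper; the name and the default list are assumptions.
def titleize(str, articles = %w[a of it])
  words = str.split.map.with_index do |w, i|
    # Capitalize the first word and every word that is not in the list.
    (i.zero? || !articles.include?(w)) ? w.capitalize : w.downcase
  end
  words.join(' ')
end

puts titleize("a title of a\t book")  # prints "A Title of a Book"
```

Passing the list as an argument keeps the method usable with a longer list than the three sample entries.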
Alternative calculation
Another way to do this, without using each_with_index, is like this:
first_word, *remaining_words = str.split
first_word
#=> "a"
remaining_words
#=> ["title", "of", "a", "book"]
"#{first_word.capitalize} #{ remaining_words.map { |w|
articles.include?(w) ? w.downcase : w.capitalize }.join(' ') }"
#=> "A Title of a Book"
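Two edge cases are worth noting with the interpolated form: a one-word string produces a trailing space, and an empty string leaves first_word as nil, so calling capitalize on it raises NoMethodError. A hedged sketch that handles both, assuming a hypothetical method name title_case:

```ruby
# Hypothetical helper; the name title_case is an assumption.
def title_case(str, articles)
  first_word, *remaining_words = str.split
  return "" if first_word.nil?  # empty or whitespace-only input

  rest = remaining_words.map do |w|
    articles.include?(w) ? w.downcase : w.capitalize
  end
  # Joining an array avoids a trailing space when rest is empty.
  [first_word.capitalize, *rest].join(' ')
end

articles = %w[a of it]
puts title_case("a title of a\t book", articles)  # prints "A Title of a Book"
puts title_case("book", articles)                 # prints "Book"
```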
Using a regular expression
str = "a title of a book"
str.gsub(/(^\w+)|(\w+)/) do
  $1 ? $1.capitalize :
    articles.include?($2) ? $2 : $2.capitalize
end
#=> "A Title of a Book"
The regular expression "captures" [(...)] a word at the beginning of the string [(^\w+)] or [|] a word that is not necessarily at the beginning of the string [(\w+)]. The contents of the two capture groups are assigned to the global variables $1 and $2, respectively.
Therefore, stepping through the words of the string, the first word, "a", is captured by capture group #1, so (\w+) is not evaluated. Each subsequent word is not captured by capture group #1 (so $1 => nil), but is captured by capture group #2. Hence, if $1 is not nil, we capitalize the (first) word (of the sentence); else we capitalize $2 if the word is not an article and leave it unchanged if it is an article.
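One caveat: in Ruby, ^ matches at the start of every line, not only at the start of the string. For a single-line title that makes no difference, but if the input could contain newlines, \A anchors to the start of the string only. A small sketch with a hypothetical two-line input:

```ruby
articles = %w[a of it]
two_lines = "a title of\na book"  # hypothetical input, not from the question

# ^ anchors at each line start, so the "a" after the newline is capitalized.
with_caret = two_lines.gsub(/(^\w+)|(\w+)/) do
  $1 ? $1.capitalize : (articles.include?($2) ? $2 : $2.capitalize)
end

# \A anchors only at the very start of the string.
with_anchor = two_lines.gsub(/(\A\w+)|(\w+)/) do
  $1 ? $1.capitalize : (articles.include?($2) ? $2 : $2.capitalize)
end

p with_caret   #=> "A Title of\nA Book"
p with_anchor  #=> "A Title of\na Book"
```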