Suppose I have the following text:
txt <- as.character("this is just a test! i'm not sure if this is O.K. or if it will work? who knows. regex is sorta new to me..  There are certain cases that I may not figure out??  sad! ^_^")
I want to capitalize the first alphabetical character of each sentence.
I worked out the following regular expression to match the sentence boundaries: ^|[[:alnum:]]+[[:alnum:]]+[.!?]+[[:space:]]*[[:space:]]+[[:alnum:]]
A call to gregexpr returns:
> gregexpr("^|[[:alnum:]]+[[:alnum:]]+[.!?]+[[:space:]]*[[:space:]]+[[:alnum:]]", txt)
[[1]]
[1] 1 16 65 75 104 156
attr(,"match.length")
[1] 0 7 7 8 7 8
attr(,"useBytes")
[1] TRUE
These are the correct start positions and lengths of the substrings I want to match.
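As a sanity check, the start positions and match.length values can be turned back into substrings with substring(). This is a minimal sketch on a shorter sample string; the names x, pat, starts, lens and matched are made up for illustration:

```r
# Shorter sample string, used only for illustration
x <- "one two! three four? five."
pat <- "^|[[:alnum:]]+[[:alnum:]]+[.!?]+[[:space:]]*[[:space:]]+[[:alnum:]]"

# gregexpr returns a list with one element per input string
m <- gregexpr(pat, x)[[1]]

starts <- as.integer(m)            # start position of each match
lens   <- attr(m, "match.length")  # length of each match (0 for the "^" match)

# Recover the matched text; a zero-length match comes back as ""
matched <- substring(x, starts, starts + lens - 1)
print(matched)
```

Note that the last character of each non-empty match is the letter to capitalize, while the zero-length match at the start of the string does not include the first letter at all.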
However, how do I use these matches to capitalize the characters I need? I'm assuming I have to strsplit and then... ?
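One direction that might avoid strsplit entirely is a backreference replacement with gsub and perl = TRUE, where \\U upper-cases whatever follows it in the replacement. This is only a sketch under assumptions: it swaps the gregexpr pattern above for a two-group variant (the first group is the sentence boundary, the second is the character to capitalize), and x and out are made-up names on a short sample string:

```r
# Short sample string, used only for illustration
x <- "one is O.K. here! two? three."

# Group 1: start of string, or 2+ alphanumerics, punctuation, whitespace
# Group 2: the character that begins the next sentence
pat <- "(^|[[:alnum:]]{2,}[.!?]+[[:space:]]+)([[:alnum:]])"

# \\1 is kept as-is; \\U upper-cases only what comes after it (\\2)
out <- gsub(pat, "\\1\\U\\2", x, perl = TRUE)
print(out)
```

Requiring at least two alphanumerics before the punctuation is what keeps "O.K. here" from being treated as a sentence break, the same as in the original pattern. Cases such as a sentence ending in "^_^" would still need separate handling.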