My answer is below, but consider using @user20650's answer instead. It is much more concise and elegant (though perhaps inscrutable if you're not familiar with Regular Expressions). As per @user20650's second comment, check to make sure that it's robust enough to work on your actual data.
Here's a tidyverse option:
library(tidyverse)
vec = c("this example sentence I have given here",
        "and here is another long example")
vec.abbrev = vec %>%
  map_chr(~ str_split(.x, pattern=" ", simplify=TRUE) %>%
            gsub("(.{5}).*", "\\1.", .) %>%
            paste(., collapse=" "))
vec.abbrev
[1] "this examp. sente. I have given. here"
[2] "and here is anoth. long examp."
In the code above, we use map_chr to iterate over each sentence in vec. The pipe (%>%) passes the result of each function on to the next function.
The period character is potentially confusing, because it has more than one meaning, depending on context. "(.{5}).*" is a Regular Expression in which . means "match any character". In "\\1." the . is a literal period. The final . in gsub("(.{5}).*", "\\1.", .) and the first . in paste(., collapse=" ") are each a "pronoun" that represents the output of the previous function, which we're passing into the current function.
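To see the two regex-related meanings of the period in isolation, here is a small sketch that applies the same gsub call to a few individual words (the words vector is just an illustration, not part of the original data). It uses base R only, so no packages are needed:

```r
# Illustrative input: words of four, five, seven, and eight characters
words <- c("this", "given", "example", "sentence")

# Pattern:     "." means "any character", so "(.{5})" captures the first five
#              characters and ".*" matches whatever follows (possibly nothing).
# Replacement: "\\1" is the captured text and the "." after it is a literal period.
abbrev <- gsub("(.{5}).*", "\\1.", words)
abbrev
# [1] "this"   "given." "examp." "sente."
```

Words shorter than five characters don't match the pattern and pass through unchanged. A word of exactly five characters does match (the ".*" part matches nothing), which is why "given" comes back with a period attached.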
Here is the process one step at a time:
# Split each string into component words and return as a list
vec.abbrev = str_split(vec, pattern=" ", simplify=FALSE)
# For each sentence, keep the first five characters of any word that has
# at least five characters, drop the rest, and append a period
# (so a word of exactly five characters also gains a period)
vec.abbrev = map(vec.abbrev, ~ gsub("(.{5}).*", "\\1.", .x))
# For each sentence, paste the component words back together again,
# each separated by a space, and return the result as a vector,
# rather than a list
vec.abbrev = map_chr(vec.abbrev, ~paste(.x, collapse=" "))
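If you'd rather not add a period to words that are already exactly five characters long (like "given"), one possible tweak is to change ".*" to ".+", so that the pattern only matches when there is at least one character to cut off. This is an assumption about the output you want, not something from the question. The sketch below uses base R, and abbreviate_words is a hypothetical helper name made up for this example:

```r
# Hypothetical helper: shortens only words longer than n characters
abbreviate_words <- function(x, n = 5) {
  # ".+" (rather than ".*") requires at least one character after the
  # first n, so words of exactly n characters are left alone
  pattern <- sprintf("(.{%d}).+", n)

  # Split each sentence on single spaces; returns a list of word vectors
  words <- strsplit(x, " ", fixed = TRUE)

  # Abbreviate the words of each sentence and paste them back together
  vapply(words,
         function(w) paste(gsub(pattern, "\\1.", w), collapse = " "),
         character(1))
}

vec <- c("this example sentence I have given here",
         "and here is another long example")
result <- abbreviate_words(vec)
result
# [1] "this examp. sente. I have given here"
# [2] "and here is anoth. long examp."
```

As with the original version, check this against your real data: anything attached to a word, such as punctuation, counts as one of its characters.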