I am trying to apply the following rule: chop the string at ";" so that each piece is as long as possible without exceeding a maximum length n.
For example,
n <- 4
string <- c("a;a;aabbbb;ccddee;ff")
output <- c("a;a;", "aabb", "bb;", "ccdd", "ee;", "ff")
For "aabb": since the length of the chunk "aabbbb" exceeds n = 4, we chop by length, 4.
For "bb;": since the length of the chunk "bb;" is less than 4, we next consider "bb;ccddee". However, the length of that longer chunk exceeds 4, and the string already contains a ";". Thus, we chop at the ";".
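To make the rule concrete, here is a minimal loop-based sketch of one way to read it: look at the next n characters, cut after the last ";" found in that window, and cut by length only when the window contains no ";". The helper name chop_string is hypothetical, and the sketch assumes the separator is a single character.

```r
# Hypothetical helper (not from any package): chop `x` into pieces of at
# most `n` characters, preferring to end a piece at the last `sep` that fits.
# Assumes `sep` is a single character.
chop_string <- function(x, n, sep = ";") {
  pieces <- character(0)
  while (nchar(x) > 0) {
    window <- substr(x, 1, n)  # at most n leading characters
    sep_pos <- gregexpr(sep, window, fixed = TRUE)[[1]]
    # gregexpr() returns -1 when the window contains no separator
    end_pos <- if (sep_pos[1] == -1) nchar(window) else max(sep_pos)
    pieces <- c(pieces, substr(x, 1, end_pos))
    x <- substr(x, end_pos + 1, nchar(x))  # drop the piece just emitted
  }
  pieces
}

chop_string("a;a;aabbbb;ccddee;ff", 4)
## pieces for this input: "a;a;" "aabb" "bb;" "ccdd" "ee;" "ff"
```

Under this reading, "a;" and "a;" stay together because both fit in one window of 4 characters.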
Currently, I can split on either condition (after every n characters, or after each ";") by using a regex.
num <- 4
splitvar <- ";"
## splits pattern
pattern <- paste0("(?<=.{", num, "}|", splitvar, ")")
pattern
## [1] "(?<=.{4}|;)"
string <- c("a;a;aabbbb;ccddee;ff")
strsplit(string, pattern, perl = TRUE)
## [[1]]
## [1] "a;" "a;" "aabb" "bb;" "ccdd" "ee;" "ff"
As you can see, we don't actually need to chop "a;" and "a;" apart, since their combined length doesn't exceed n (2 + 2 = 4).
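One regex-based direction that may work, offered as a sketch rather than a verified general solution: instead of splitting, extract the pieces with regmatches() and gregexpr(). The pattern first tries the longest run of at most n characters that ends in ";", and falls back to a plain run of up to n characters. The name pattern2 is introduced here only for illustration, and the sketch assumes splitvar is a single character with no special meaning in a regex.

```r
num <- 4
splitvar <- ";"
string <- "a;a;aabbbb;ccddee;ff"

## Alternative 1: up to num - 1 characters followed by the separator.
## Alternative 2 (fallback): 1 to num characters of anything.
pattern2 <- paste0(".{0,", num - 1, "}", splitvar, "|.{1,", num, "}")
pattern2
## [1] ".{0,3};|.{1,4}"

result <- regmatches(string, gregexpr(pattern2, string, perl = TRUE))[[1]]
result
## pieces for this input: "a;a;" "aabb" "bb;" "ccdd" "ee;" "ff"
```

Because the quantifier is greedy, the first alternative keeps "a;a;" together, while "aabb" comes from the fallback since no ";" occurs within its first 4 characters.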
Does anyone have a solution for this? Thank you.