I need to count the lines in each of 221 poems and tried counting the line breaks \n.
However, some lines are followed by a double line break \n\n to start a new verse. Each of these should count as one line break only. The number and position of the double line breaks differ from poem to poem.
Minimal working example:
library("quanteda")
poem1 <- "This is a line\nThis is a line\n\nAnother line\n\nAnd another one\nThis is the last one"
poem2 <- "Some poetry\n\nMore poetic stuff\nAnother very poetic line\n\nThis is the last line of the poem"
# corpus() expects a single character vector, so combine the poems with c()
poems <- quanteda::corpus(c(poem1, poem2))
The resulting line count should be 5 for poem1 and 4 for poem2.
I tried stringi::stri_count_fixed(texts(poems), pattern = "\n"), but this counts every single line break, so each double line break is counted twice. A fixed pattern cannot treat a run of consecutive line breaks as one.
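One direction that might work is to split each poem on runs of one or more line breaks and count the resulting pieces. The following is only a minimal sketch under some assumptions: `count_poem_lines` is a hypothetical helper name, the poems are held in a plain character vector (for a quanteda corpus, something like `as.character(poems)` would presumably be needed first), and blank lines never carry stray whitespace.

```r
library(stringi)

poem1 <- "This is a line\nThis is a line\n\nAnother line\n\nAnd another one\nThis is the last one"
poem2 <- "Some poetry\n\nMore poetic stuff\nAnother very poetic line\n\nThis is the last line of the poem"
poem_texts <- c(poem1, poem2)

# Hypothetical helper: split on runs of one or more line breaks ("\\n+"),
# drop empty pieces (e.g. from a leading or trailing line break),
# and count the pieces left for each poem.
count_poem_lines <- function(x) {
  lengths(stringi::stri_split_regex(x, "\\n+", omit_empty = TRUE))
}

line_counts <- count_poem_lines(poem_texts)
print(line_counts)
```

Splitting rather than counting matches avoids having to add 1 to the number of breaks, and `omit_empty = TRUE` keeps a line break at the very start or end of a poem from inflating the count.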