Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

298 questions
5
votes
1 answer

How to use back reference with stringi package?

In R I can use \\1 to refer to a capturing group. However, when using the stringi package, this doesn't work as expected. library(stringi) fileName <- "hello-you.lst" (fileName <- stri_replace_first_regex(fileName, "(.*)\\.lst$", "\\1")) [1]…
Bram Vanroy
  • 27,032
  • 24
  • 137
  • 239
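A minimal sketch for the back-reference question above, assuming the goal is to strip the ".lst" suffix: stringi hands the replacement string to ICU, where capture groups are written as `$1`, `$2`, and so on rather than `\\1`. The variable names are illustrative.

```r
library(stringi)

file_name <- "hello-you.lst"

# ICU replacement strings refer to capture groups as $1, $2, ...
stem <- stri_replace_first_regex(file_name, "(.*)\\.lst$", "$1")
stem
```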
5
votes
2 answers

different output using stringi and gsub (using the same pattern on the same string)

I wish to know why I obtain two different output strings by using gsub and stringi. Does the metacharacter "." not include new lines in stringi? Does stringi read "line by line"? By the way I did not find any way to perform the "correct"…
Dario Lacan
  • 1,099
  • 1
  • 11
  • 25
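One likely source of the difference asked about above is that, in ICU regular expressions, "." does not match a line terminator unless the dotall flag is set. The sketch below uses a hypothetical two-line input, since the original string is not shown in the excerpt.

```r
library(stringi)

x <- "start\nend"  # hypothetical input spanning two lines

# Default: "." stops at the line break, so nothing is replaced
default_res <- stri_replace_all_regex(x, "start.end", "X")

# With dotall, "." also matches the line terminator
dotall_res <- stri_replace_all_regex(
  x, "start.end", "X",
  opts_regex = stri_opts_regex(dotall = TRUE)
)

# The inline flag (?s) switches on the same behaviour
inline_res <- stri_replace_all_regex(x, "(?s)start.end", "X")
```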
5
votes
2 answers

Retrieving sentence score based on values of words in a dictionary

Edited df and dict I have a data frame containing sentences: df <- data_frame(text = c("I love pandas", "I hate monkeys", "pandas pandas pandas", "monkeys monkeys")) And a dictionary containing words and their corresponding scores: dict <-…
Steven Beaupré
  • 21,343
  • 7
  • 57
  • 77
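A rough sketch of one way to score the sentences above with stringi. The dictionary in the excerpt is truncated, so the `scores` vector here is purely hypothetical, and treating unknown words as zero is an assumption.

```r
library(stringi)

text <- c("I love pandas", "I hate monkeys",
          "pandas pandas pandas", "monkeys monkeys")

# Hypothetical scores; the dictionary in the question is not shown in full
scores <- c(love = 1, hate = -1, pandas = 2, monkeys = -2)

# Split each sentence into lower-cased words
words <- stri_extract_all_words(stri_trans_tolower(text))

# Sum the scores of known words; words missing from the dictionary count as 0
sentence_score <- vapply(
  words,
  function(w) sum(scores[w], na.rm = TRUE),
  numeric(1)
)
sentence_score
```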
5
votes
2 answers

Transliterate Latin to ancient Greek letters

There is a simple way to transform Latin letters to Greek letters, using the stringi package for R which relies on ICU's transliterator here: library(stringi) stri_trans_general("abcd", "latin-greek") Is there a similar simple way to convert Latin…
ckluss
  • 1,477
  • 4
  • 21
  • 33
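The question above is cut off, so the sketch below only shows how one might inspect which Greek-related transliterators the installed ICU offers before choosing one. Which identifiers appear depends on the ICU version, so no particular result is assumed.

```r
library(stringi)

# List the transliterator IDs known to the installed ICU and keep the Greek ones
ids <- stri_trans_list()
greek_ids <- ids[stri_detect_fixed(ids, "Greek")]
greek_ids

# The call from the question, for comparison
greek <- stri_trans_general("abcd", "latin-greek")
greek
```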
4
votes
4 answers

How can I split the following string using R?

I want to split the following character string from a chess game into separate strings like the ones below removing the "1-9." pattern while maintaining all the other text. Example: text <- "1. e4 e5 2. Nf3 Nf6 3. Nxe5 d6 4. Nd3 Nxe4 5. Qe2 Qe7 6.…
PabloAB
  • 233
  • 1
  • 6
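A possible approach to the chess split above, assuming the desired pieces are the moves between the numbered markers (the expected output is not visible in the excerpt). Only the first few moves of the example are used, because the rest is truncated.

```r
library(stringi)

# Only the visible start of the example; the rest is truncated in the excerpt
text <- "1. e4 e5 2. Nf3 Nf6 3. Nxe5 d6 4. Nd3 Nxe4"

# Split on "<digits>." and drop the empty piece before the first move
moves <- stri_split_regex(text, "\\d+\\.\\s*", omit_empty = TRUE)[[1]]

# Remove the trailing space left before each move number
moves <- stri_trim_both(moves)
moves
```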
4
votes
2 answers

R install package stringi in renv

I am trying to install the package stringi with renv::install(). Normally, I would use install.packages('stringi', configure.vars='ICUDT_DIR=path/to/icudt61l.zip/') To specify the location of icudt61l.zip dependency. How can I do this in renv ? I…
gdevaux
  • 2,308
  • 2
  • 10
  • 19
4
votes
2 answers

Exclude "I" and "O" from alpha-numeric id in stringi character set

I know from Generate unique alphanumeric IDs, that I can use stringi and stri_rand_strings to generate a unique alpha-numeric id. I am trying to figure out an efficient way to do so but only include the numbers 0-9 and all letters but "I" and "O". …
MatthewR
  • 2,660
  • 5
  • 26
  • 37
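For the ID question above, one option is to spell out the allowed characters as a character class that skips "I" and "O". The count and length below are arbitrary illustration values.

```r
library(stringi)

# Character class: digits plus A-Z without "I" and "O"
ids <- stri_rand_strings(n = 5, length = 8, pattern = "[0-9A-HJ-NP-Z]")
ids

# stri_rand_strings does not guarantee uniqueness; check before relying on it
anyDuplicated(ids)
```

With very large batches, duplicates become more likely, so regenerating any repeated IDs may be needed.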
4
votes
3 answers

String replace with regex condition

I have a pattern that I want to match and replace with an X. However, I only want the pattern to be replaced if the preceding character is either an A, B or not preceded by any character (beginning of string). I know how to replace patterns using…
Nivel
  • 629
  • 4
  • 12
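The condition described above can be sketched with a negative lookbehind: "not preceded by a character other than A or B" covers A, B, and the start of the string. The pattern `foo` is a hypothetical stand-in, since the real pattern is not shown.

```r
library(stringi)

x <- c("foo", "Afoo", "Bfoo", "Cfoo")  # hypothetical inputs

# (?<![^AB]) succeeds at the start of the string or right after "A" or "B"
res <- stri_replace_all_regex(x, "(?<![^AB])foo", "X")
res
```

Note that `[^AB]` also matches a line break, so a match directly after a newline is left untouched under this sketch.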
4
votes
2 answers

Add a white-space between number and special character condition R

I'm trying to use stringr or R base calls to conditionally add a white-space for instances in a large vector where there is a numeric value then a special character - in this case a $ sign without a space. str_pad doesn't appear to allow for a…
js80
  • 385
  • 2
  • 11
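A sketch of the conditional spacing asked about above, using stringi rather than stringr or base R. The input vector is hypothetical. In an ICU replacement string, `$1` refers to the first capture group, so a literal dollar sign has to be escaped.

```r
library(stringi)

x <- c("10$", "5 $", "abc$")  # hypothetical inputs

# Capture the digit, then re-emit it followed by a space and a literal "$"
res <- stri_replace_all_regex(x, "(\\d)\\$", "$1 \\$")
res
```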
4
votes
2 answers

Count unique string patterns in a row

I have the following example: dat <- read.table(text="index string 1 'I have first and second' 2 'I have first, first' 3 'I have second and first and thirdeen'", header=TRUE) toMatch <- c('first', 'second', 'third') dat$count <-…
LMach
  • 43
  • 3
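One way to count how many distinct patterns occur in each row above, assuming plain substring matching is intended (so "third" also counts inside "thirdeen"). The strings are rebuilt as a plain vector for brevity.

```r
library(stringi)

strings <- c("I have first and second",
             "I have first, first",
             "I have second and first and thirdeen")
toMatch <- c("first", "second", "third")

# One logical column per pattern: does the pattern occur in each string?
hits <- sapply(toMatch, function(p) stri_detect_fixed(strings, p))

# Number of distinct patterns found per string
counts <- rowSums(hits)
counts
```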
4
votes
1 answer

Cannot install stringi since Xcode Command Line Tools update

System: macOS Sierra 10.12.6 Xcode: 9.2 (2347) R: 3.4.0 RStudio: 1.1.383 I'm attempting to install the latest version of stringi (1.1.6). This isn't possible since the most recent update to Xcode. The error received is configure: error: C…
Serenthia
  • 1,222
  • 4
  • 22
  • 40
4
votes
1 answer

Converting accents to ASCII in R

I'm trying to convert special characters to ASCII in R. I tried using Hadley's advice in this question: stringi::stri_trans_general('Jos\xe9', 'latin-ascii') But I get "Jos�". I'm using stringi v1.1.1. I'm running a Mac. My friends who are running…
Huey
  • 2,714
  • 6
  • 28
  • 34
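A possible explanation for the result above is that `'Jos\xe9'` is a Latin-1 byte sequence, which is not valid UTF-8. The sketch below assumes that is the cause and converts the bytes to UTF-8 before transliterating. The byte vector simply spells out the same string.

```r
library(stringi)

# "Jos" followed by the Latin-1 byte for e-acute (0xE9)
latin1_bytes <- rawToChar(as.raw(c(0x4a, 0x6f, 0x73, 0xe9)))

# Re-encode to UTF-8, stating the source encoding explicitly
utf8 <- iconv(latin1_bytes, from = "latin1", to = "UTF-8")

# Now the accent can be stripped
ascii <- stri_trans_general(utf8, "Latin-ASCII")
ascii
```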
4
votes
2 answers

Extract words between doubles quotes in a variable in R

I want to extract the name from the following input which is of the form as shown in brackets # Example of the input in brackets('name":"Tale") name<- c('name":"Tale"','name":"List"') I want to extract the names between the quotes as shown below.…
user3570187
  • 1,743
  • 3
  • 17
  • 34
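For the extraction above, a capture group around the text between the second pair of quotes is one straightforward option. The sketch assumes every input has the `name":"value"` shape shown in the question.

```r
library(stringi)

name <- c('name":"Tale"', 'name":"List"')

# Column 1 holds the full match, column 2 the first capture group
extracted <- stri_match_first_regex(name, ':"([^"]*)"')[, 2]
extracted
```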
4
votes
5 answers

How to split a string from right-to-left, like Python's rsplit()?

Suppose a vector: xx.1 <- c("zz_ZZ_uu_d", "II_OO_d") I want to get a new vector split at the rightmost separator, and only once. The expected results would be: c("zz_ZZ_uu", "d", "II_OO", "d"). It would be like Python's rsplit() function. My current…
ccshao
  • 499
  • 2
  • 8
  • 19
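A sketch of a right-to-left single split for the vector above: the lookahead only lets the underscore match when no further underscore follows, so each string is cut once at its last separator.

```r
library(stringi)

xx.1 <- c("zz_ZZ_uu_d", "II_OO_d")

# Match only the last "_" in each string: nothing but non-underscores may follow
parts <- stri_split_regex(xx.1, "_(?=[^_]*$)")

# Flatten to the shape given in the question
res <- unlist(parts)
res
```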
3
votes
1 answer

How does strsplit work on fixed elements with the splitter at the end of the string to split?

I was working on a language parser and I wanted to count certain string elements (say "") in a larger string. Since the string has been cleansed (str.trim), it doesn't have any content after it. I was getting some weird behavior on strsplit as…
mshaffer
  • 959
  • 1
  • 9
  • 19
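The behaviour described above can be reproduced with a small example: base `strsplit` drops the empty piece after a trailing delimiter, while `stri_split_fixed` keeps it by default. The delimiter `<br>` is a hypothetical stand-in, since the element in the excerpt is missing. For counting occurrences, `stri_count_fixed` avoids splitting altogether.

```r
library(stringi)

x <- "a<br>b<br>"  # hypothetical string ending with the delimiter

# Base R: no empty string is returned after the final delimiter
base_parts <- strsplit(x, "<br>", fixed = TRUE)[[1]]

# stringi: the trailing empty piece is kept unless omit_empty = TRUE
stri_parts <- stri_split_fixed(x, "<br>")[[1]]

# Counting the delimiter directly
n_delims <- stri_count_fixed(x, "<br>")
```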