Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

298 questions
0
votes
0 answers

Regex error: pattern exceeds limits on size or complexity

I have a dataframe of ~20,0000 observations, and I am focused specifically on a column that has abstracts of scientific journals. I am attempting to pull plant species out of these abstracts, so I wanted to use this function to do so... find.all.matches…
0
votes
1 answer

How to convert a long vector of class character containing non-ASCII unicode characters to their escaped version?

I have an R package in which I have a list of university names that I want to match to the user input. The list of names contains special characters and this is generating a warning in R CMD check: checking data for non-ASCII characters (855ms) …
rempsyc
  • 785
  • 5
  • 24
0
votes
1 answer

Detecting International Phonetic Alphabet (IPA) symbols / character blocks from word strings using R

I am trying to figure out the best way to split strings (words) into individual phones using R, but I haven't been able to come up with a good solution. I am aware that one solution would be to use the gruut-ipa module, but I cannot shake the feeling that…
Samsom
  • 1
0
votes
2 answers

Having trouble loading tidyverse library

install.packages("dslabs") install.packages("tidyverse") library(dslabs) library(tidyverse) data(murders) murders %>% ggplot(aes(population, total, label = abb, color = region)) + geom_label() Error in murders %>% ggplot(aes(population,…
0
votes
1 answer

Fail to install older version of stringi using renv

I've been trying to install an older version of stringi using renv and am getting the following error. I've had a few earlier errors using restore, but a few restarts have been able to move things through. I tried removing and reinstalling stringi,…
Corey
  • 405
  • 2
  • 6
  • 18
0
votes
4 answers

Regex: matching for only one number in an integer R

I have an example list of numbers: 888* 8* 8.88* 88.88* 88888.888* 899900 8.89 0.08 80 89899 50 32 30.8 0.081 0.8 8.1 and I only want to match those that have only 8's. I put an asterisk for the ones I only want and the others should be ignored. I…
karuno
  • 391
  • 4
  • 12
0
votes
2 answers

Replace specific character from all columns of a dataframe

Having a dataframe like this: df <- data.frame(id = c(1,2), date1 = c("Nov 2016 Current", "Nov 2016 Current"), date2 = c("Nov 2016 Current", "Nov 2016 Current")) Is there any command to replace this character in…
Erik Brole
  • 315
  • 9
0
votes
2 answers

Wrapping a string extract function in an ifelse statement

The question below is an extension of this question. Example data I have example data as follows: library(data.table) example_dat <- fread("var_nam description some_var this_is_some_var_kg other_var this_is_meters_for_another_var …
Tom
  • 2,173
  • 1
  • 17
  • 44
0
votes
0 answers

Converting printable EBCDIC with replacement for non-printable

I'm doing some analysis on data, a part of which is in hex pairs encoded in EBCDIC, but contains both printable and non-printable characters. What I'm doing at the moment is to apply my own masking translation on the ebcdic hex before translating…
Ian
  • 1,507
  • 3
  • 21
  • 36
0
votes
1 answer

How to remove hidden line breaks in character content?

I have a text like this: "Hello how are you" %>% word(-1) I get an error if I perform this because I pressed Enter after how. How do I remove the hidden line break so as to perform my code on the text?
GiulioGCantone
  • 195
  • 1
  • 10
0
votes
1 answer

How to extract specific characters using str_extract() in R

Context: I have a character vector a. I want to extract the text between the last slash (/) and the .nc using the str_extract() function. I have tried it like this: str_extract(a, "(?=/).*(?=.nc)"), but it failed. Question: How can I get the text between the…
zhiwei li
  • 1,635
  • 8
  • 26
0
votes
1 answer

This keeps happening in R every time I try to load packages

So, I keep getting this error: library(tidyverse) Warning: package ‘tidyverse’ was built under R version 4.1.3 Error: package or namespace load failed for ‘tidyverse’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is…
kfeye
  • 13
  • 4
0
votes
1 answer

Same regex behaves differently on grepl versus stri_detect_regex

Edit: I encounter this on R version 3.6.1; apparently this issue does not exist in newer versions, where the functions behave similarly. Consider this vector, where the first element is in the Latin-1 Supplement Unicode block, the second element is in…
Merijn van Tilborg
  • 5,452
  • 1
  • 7
  • 22
0
votes
1 answer

Update the last occurrence of a word in a string only if certain condition is TRUE in R Programming

I have a dataframe with two character columns where I want to make the following changes: library(stringr) Airport_ID <- c("3001","3002","3003","3004") Airport_Name <- c("Adelaide Airport DTS", "Brisbane DTS Land Airport Land ADTS", "Washington DTS…
BobbyG
  • 45
  • 5
0
votes
1 answer

Select and extract different capture groups from string using regex

I would like to extract various parts of a string using regex patterns and capturing groups. I am able to filter the string using str_match_all, but I would like to have the possibility to explicitly select one of the capturing groups, defined in…
hannes101
  • 2,410
  • 1
  • 17
  • 40