Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.
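For readers new to the tag, here is a minimal sketch of what typical stringi calls look like. The input strings and variable names are made up for illustration; only the `stri_*` functions come from the package, and the example assumes stringi is already installed.

```r
library(stringi)

# Illustrative inputs (hypothetical, not taken from any question on this page)
greeting <- "Gr\u00fc\u00dfe"   # "Grüße", written with escapes to stay encoding-safe
dotted   <- "a.b.c"

# Count Unicode code points rather than bytes
n_chars <- stri_length(greeting)

# Case mapping
upper <- stri_trans_toupper("hello")

# Replace every occurrence of a fixed (non-regex) pattern
dashed <- stri_replace_all_fixed(dotted, ".", "-")

# Vectorised regex detection: one logical value per input string
has_digit <- stri_detect_regex(c("abc", "a1c"), "[0-9]")

# Split text into words using ICU's word-boundary rules
words <- stri_extract_all_words("hello, world")[[1]]

# Locale-aware sorting; the locale is passed on to the ICU collator
sorted <- stri_sort(c("b", "a", "c"), locale = "en_US")
```

Most functions follow the `stri_<action>_<engine>` naming pattern (for example `_fixed`, `_regex`, `_coll`) and are vectorised over their arguments, which is why a single call covers all occurrences in all strings.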


298 questions
1
vote
1 answer

How do I extract a word, that is contained in a group/list of words, from a string?

From a character vector of strings x <- c("Point to Point Movement, Route/Network Building", "Betting/Wagering, Dice Rolling, Roll / Spin and Move", "Hand…
Nip
  • 387
  • 4
  • 11
1
vote
0 answers

Converting journal titles to their abbreviated form

Good morning my hero! I have a list of journal titles in English, Spanish and Portuguese that I want to convert to their abbreviated form. The official abbreviation dictionary for journal titles is the List of Title Word Abbreviations found on the…
Amalia
  • 43
  • 6
1
vote
1 answer

Match on substring and other variables

I am trying to merge two datasets on key values and string patterns. Basically, I would like a function that counts the number of substring matches, conditional on other key variables matching across the two datasets. Across two datasets, base and…
MCS
  • 1,071
  • 9
  • 23
1
vote
2 answers

Recursive stringi commands

I am cleaning some string data using some stringi functions as part of a pipe. I would like these functions to be recursive, so that they tackle all the possible occurrences of a re, not only the first one. I cannot predict ex ante the number of…
MCS
  • 1,071
  • 9
  • 23
1
vote
0 answers

I want to install semPlot packages in R

if i do install.packages("semPlot") in R, "C:/rtools40/mingw32/bin/"g++ -std=gnu++11 -I"C:/PROGRA~1/R/R-41~1.0/include" -DNDEBUG -I. -Iicu69/ -Iicu69/unicode -Iicu69/common -Iicu69/i18n -DU_STRINGI_PATCHES -DUCONFIG_USE_LOCAL…
hohohaha
  • 21
  • 1
  • 2
1
vote
1 answer

Replace backslash to double backslash in special character

How can I replace the backslash in "\u2265" (greater-than-or-equal sign) to get "\\u2265"? I've tried the following: str_replace_all("\u2265", "\\\\", "\\\\\\") stri_replace_all_fixed("\u2265", "\\", "\\\\") Both lead to "≥" which is not "\u2265".
TobiSonne
  • 1,044
  • 7
  • 22
1
vote
1 answer

Error installing the stringi package on R 4 on Linux Ubuntu

I'm trying to install the roxygen2 package on R 4.0.3 on Linux Ubuntu 16.04.7 LTS. It fails because it needs the stringi package to be installed first; I try to install it with the usual command install.packages("stringi") but it fails again and I…
DavideChicco.it
  • 3,318
  • 13
  • 56
  • 84
1
vote
3 answers

counting delimited unique strings in a data frame in R

I have a data frame as follows: a <- c(1, 2, 3, 4) b <- c("AA; AA; BC", "BC; DE", "AA; BC; BC", "DE; DE") df <- data.frame(a,b) I want to count the number of unique two-letter combinations in each string in column b. So the correct answer would be…
user237554
  • 79
  • 8
1
vote
1 answer

R: Transform Cyrillic Unicode to Latin Text

I have some Unicode text in Cyrillic, gathered from a website using R selenium; the language is Serbian. A sample of the Unicode text is in this form:
Roberto
  • 181
  • 8
1
vote
0 answers

get only first row matches a specific string - R

I have a tibble in R. tibble [9,760,576 x 2] (S3: tbl_df/tbl/data.frame) $ word: chr [1:9760576] "dont know" "years ago" "im sure" "years old" ... $ n : int [1:9760576] 7240 5127 5068 5017 ... Executing the following code gets the rows…
1
vote
0 answers

R fastest way to convert an entire data table to lower case

I have a function written as: setlower <- function(df) { for(j in seq_along(df)){ data.table::set(df, j=j, value=stringi::stri_trans_tolower(df[[j]])) } invisible(df) } I have a much larger package that calls this function on multiple…
Dylan Russell
  • 936
  • 1
  • 10
  • 29
1
vote
1 answer

matching strings regex exact match - special characters

Following on from a solved thread here: matching strings regex exact match (with a big thank-you to @Onyambu for the updated code). I need to match strings exactly, even if there are special characters. Note: apologies, this is the third question…
Keelin
  • 367
  • 1
  • 10
1
vote
1 answer

Why is lapply not forwarding additional arguments?

I have a big Dataset of Tweets where every row is one unique Tweet and I have a list of Keywords which I want to extract from these Tweets if one or more of them are present in the variable text. This List of Keywords has been compiled into a…
LTribe
  • 37
  • 4
1
vote
3 answers

Extracting strings from links using regex in R

I have a list of URL links and I want to extract one of the strings and save them in another variable. The sample data is below: sample<- c("http://dps.endavadigital.net/owgr/doc/content/archive/2009/owgr01f2009.pdf", …
user86907
  • 817
  • 9
  • 21
1
vote
2 answers

Count alphabetic to numeric and numeric to alphabetic transitions

I have some data, now I have to count the number of transitions from alphabetic to numeric OR from numeric to alphabetic. dd <-…