Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

298 questions
2 votes, 2 answers

Regex to convert time equations to R date-time (POSIXct)

I'm reading in data from another platform where a combination of the strings listed below is used for expressing timestamps: * = current time, t = current day (00:00), mo = month, d = days, h = hours, m = minutes. For example, *-3d is current time…
Gautam
2 votes, 0 answers

Clean up very long list of company names - apply function on each row of data.table

I have a data.table with company names and address information. I want to remove legal entities and the most common words from the company name, so I wrote a function and applied it to my data.table. search_for_default <- c("inc", "corp",…
Judy
2 votes, 1 answer

Problem with install.packages ("stringi") on R

On Windows 10, RStudio. I've tried using the install.packages command, but this message always appears: > install.packages ("stringi") There is a binary version available but the source version is later: binary source…
2 votes, 2 answers

Matching strings loop over multiple columns

I have data from an open-ended survey. I have a comments table and a codes table. The codes table is a set of themes or strings. What I am trying to do: check whether a word / string from the relevant column in the codes table is in an…
Keelin
2 votes, 2 answers

summarize and spread by almost identical strings

I started with several raw data frames with similar items, cleaned and merged them into a long format, which I later convert to wide format using dplyr. However, I'm left with duplicates because I'm dealing with almost identical strings. Can anyone please…
Hammao
2 votes, 0 answers

stringi::stri_unescape_unicode() for longer / rarer unicode characters?

We see stringi::stri_unescape_unicode() works for the first two unicode characters, but seemingly not the third: library(stringi) library(dplyr) c("\\uFFE6", "\\u1ECB0", "\\u11FE0") %>% stringi::stri_unescape_unicode() # [1] "₩" "ị0" …
stevec
2 votes, 2 answers

How to replace characters in a string vector based on a position vector in R?

For an example: set.seed(123) library(stringi) df<-data.frame(p=sprintf("%s", stri_rand_strings(11, 11, '[A-Z]')), n=sample(1:10, 11, 1), s=sprintf("%s", stri_rand_strings(11, 1, '[A-Z]'))) df p n s 1 …
David Z
2 votes, 3 answers

How to get rows with elements in the list column

I have a datatable as below: library(data.table) dt <- data.table( id = c(1:3), string = list(c("tree", "house", "star"), c("house", "tree", "dense forest"), c("apple", "orange", "grapes")) ) From this I…
Ricky
2 votes, 1 answer

icudt error while installing stringi package from r in linux offline

I have downloaded the stringi_1.4.3.tar.gz package on my system (RedHat Linux 7), but when I try to install it offline I get the error below: Execution halted *** icudt download failed. stopping. ERROR: configuration failed for package…
Aparna
2 votes, 2 answers

stringr function to concatenate vector of words separated by comma with "and" before last word

I know I can easily write one, but does anyone know if stringr (or stringi) already has a function that concatenates a vector of one or more words separated by commas, but with an "and" before the last word?
code_cowboy
2 votes, 0 answers

How to eliminate security concern while accessing to the file through program in R in Windows?

While accessing a CSV file from disk with an R program, where the path to the CSV file is provided in a configuration file (a path like "testData/Amazon S3/Inventory/Accounts.csv", which is provided in the configuration file and…
Bhavneet sharma
2 votes, 1 answer

Unlist str_locate_all into separate start and end lists

I use str_locate_all to get the start and end positions of a list of patterns in my string. It returns a list with the start and stop position for each match. How can I get the start and stop positions of all matches into separate lists?…
Nivel
2 votes, 1 answer

How to install package stringi in R on Windows 10?

I just installed R and RStudio, so it's the latest version. I am trying to install Rattle but I get an error for stringi, so I tried to install stringi, but I get the following error: Installing package into ‘C:/Users/.../Documents/R/win-library/3.5’ (as…
MRM
2 votes, 4 answers

Fast way for String matching and replacement from another dataframe in R

I have two dataframes that look like this (although the first one is over 90 million rows long and the second is a little over 14 million rows). Also, the second dataframe is randomly ordered. df1 <- data.frame( datalist =…
Kayla
2 votes, 2 answers

R - Grepl vector over vector

I have a vector of character strings (v1) like so: > head(v1) [1] "do_i_need_to_even_say_it_do_i_well_here_i_go_anyways_chris_cornell_in_chicago_tonight" [2] "going_to_see_harry_sunday_happiness" …
Christopher Costello