Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent and convenient string/text processing in each locale and any native character encoding. The use of the ICU library gives R users a platform-independent set of functions known to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

Related tags

regex
R's stringr

298 questions

1 answer

Replace rules (string pattern matching) in R

I know a similar question might have been asked in this forum, but I feel my requirement is peculiar. I have a data frame with a column containing the following values. Below is just a sample; the column contains more than 1000 observations. Reported Terms "2 Left…

r replace pattern-matching stringr stringi

asked Oct 18 '19 at 09:03

Pavan kumar

0 answers

Find and replace text in XML

I am trying to edit the value of maxTreeAgeInit="50.0" in an XML file (outline as follows). The middle of my XML file (xml version="1.0" encoding="utf-8") of interest:

r xml stringr radix stringi

asked Oct 18 '19 at 04:30

pepdave

1 answer

How do I extract the second number pairing from a character string?

If I have a column with character variables that look like "1000_D_22", "1002M_26", and "1014_17_2/3/2019", how do I strip the characters so that I get "22", "26", and "17"?

r stringr stringi

asked Oct 15 '19 at 21:33

blaze

1 answer

replace parts of a string with a vector

I am having problems with replacing parts of a single string with a set of vector replacements, to result in a vector. I have a string tex which is intended to tell a diagram what text to put as the node (and other) labels. So if tex is "!label has…

r stringr stringi

asked Jun 13 '19 at 10:19

Steve Powell

2 answers

Is there an R function for transforming an entire df to lower case?

I'm setting up a data table and expect to transform all the data to lower case; I thought it would look neat. How can I do that?

r uppercase stringr lowercase stringi

asked Apr 17 '19 at 12:34

Mr.KT

0 answers

tokens_replace() only works with stri_trans_general() and not with Encoding()

While playing around with lemmatizing, stopwords removal, stemming etc. for German text, I had problems using the tokens_replace() function in the quanteda package. I found a solution (see code) which seems to work although I do not understand why.…

r encoding quanteda stringi

asked Feb 20 '19 at 12:22

LeaK

1 answer

How to split a text into a vector, where each entry corresponds to an index value assigned to each unique word?

Let's say I have a document with some text, like this, from SO: doc <- 'Questions with similar titles have frequently been downvoted and/or closed. Consider using a title that more accurately describes your question.' I can then make a dataframe…

r dplyr cpu-word stringi

asked Feb 07 '19 at 14:42

Union find

0 answers

Add conditional whitespace after special character and N additional characters

I am cleaning the following web-scraped data and getting vectors without proper spacing in consistent places: " SharePriceNAVPremium/Discount" "Current$21.26$20.901.72%" "52 Wk Avg$24.41$23.245.05%" "52 Wk High$28.00$25.0518.09%" "52 Wk…

r data-cleaning stringr stringi

asked Jan 10 '19 at 17:08

js80

2 answers

Add a list column to a dataframe

I have a dataframe with 100 rows. One column within the dataframe consists of text. I would like to separate the text column into sentences so that the text column becomes a list of sentences. I am splitting with the stringi package function…

r stringi

asked Jan 09 '19 at 16:03

Sebastian Zeki

2 answers

String replace ignoring characters

I have the following string: string <- c("ABDSFGHIJLKOP") and a list of substrings: sub <- c("ABDSF", "SFGH", "GHIJLKOP") I would like to include < and > after each sub match thus getting: I have tried the following code by…

r regex gsub stringr stringi

asked Dec 28 '18 at 13:27

Nivel

3 answers

Splitting column with differing syntax in R

I am having some trouble cleaning up my data. It consists of a list of sold houses. It is made up of the sell price, no. of rooms, m2 and the address. As seen below the address is in one string. Head(DF, 3) Address Price…

r dataframe stringi

asked Aug 27 '18 at 16:30

Thomas

2 answers

Remove everything before a certain occurrence identified by position in string

I have a string looking like a. I would like to delete everything before the 2nd to last occurrence of the pattern === test, === included. a <- "=== test : {abc} === test : {abc} === test : {abc} === test : {aUs*} === dce …

r regex string stringi

asked Aug 22 '18 at 17:02

thequietus

1 answer

Why won't ggplot install properly on my machine after an upgrade?

I've had a problem for a while now in which I can't load the stringi package until I install it clean. This seems to work as long as I'm in a single R session. Then, some time later, maybe when I create a new session or probably after a longer…

r ggplot2 stringi

asked Jul 26 '18 at 21:09

Ben Smith

3 answers

Use R to read a text file and format extracted data in to a table

I have a text file in the following basic format which repeats a few thousand times: Patient Name- John Smith Number of dx codes: 123 Number of pr codes: 678 Charges: 910 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis arcu ipsum,…

r text stringi

asked Jun 26 '18 at 16:40

user6340762

1 answer

stri_unescape_unicode() fails on some characters

I have a problem with converting Unicode characters in R. I am following this approach, but stri_unescape_unicode from the stringi library fails to return the correct value in some cases. Let me show an example where the correct value should be word…

r unicode encoding character-encoding stringi

asked Jun 07 '18 at 08:55

pieca

