Questions tagged [strsplit]

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings around a given delimiter:

# R
strsplit(x, split, fixed = FALSE)

% MATLAB
strsplit(x, split)

strsplit splits a character string or vector of character strings using a regular expression or a literal (fixed) string. It returns a list (R) or cell array (MATLAB); in R, each list item holds the pieces of the corresponding element of x.

  • x: a character string or vector of character strings to split.
  • split: the delimiter, i.e. the character string at which x is split.
    In R, if split is an empty string (""), x is split between every character.
  • [R only] fixed: whether split should be treated as a fixed (literal) string. The default is FALSE, which means that split is treated as a regular expression.
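
As an illustration of the arguments described above, here is a minimal R sketch. The sample vector and the variable names (x, regex_parts, fixed_parts, chars) are made up for this example and do not come from the documentation.

```r
# Hypothetical sample data: a character vector with "." as the delimiter
x <- c("a.b.c", "d.e")

# Default (fixed = FALSE): split is a regular expression, and "." matches
# any character, so every character acts as a delimiter and only empty
# strings are left over.
regex_parts <- strsplit(x, ".")

# fixed = TRUE: split is taken literally, so x is split at each dot.
fixed_parts <- strsplit(x, ".", fixed = TRUE)

# The result is a list with one item per element of x.
fixed_parts[[1]]  # pieces of "a.b.c"
fixed_parts[[2]]  # pieces of "d.e"

# An empty split string splits between every character; [[1]] extracts
# the character vector for the first (and only) element of the input.
chars <- strsplit("abc", "")[[1]]
```

Escaping the dot in the regular expression, as in strsplit(x, "\\."), is an alternative to fixed = TRUE for this case.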
702 questions
3
votes
6 answers

Split a string by two delimiters only in the first occurrence

I have read many examples here and on other forums and tried things myself, but I still can't do what I want. I have a string like this: myString <- c("ENSG00000185561.10|TLCD2", "ENSG00000124785.9|NRN1", "ENSG00000287339.1|RP11-575F12.4") And I want to…
Paula
3
votes
5 answers

Split df column of integers into individual digits in R

I have a df where one variable is an integer. I'd like to split this column into its individual digits. See my example below: Group Number A 456 B 3 C 18 To Group Number Digit1 Digit2 Digit3 A 456 4 5 6 B 3 3 …
3
votes
1 answer

Duplicate rows based on other columns containing values, then return row with split column value

I have this df that contains rows that need to be duplicated based on the number of letters separated by '-' in the 'Group' column. I want each duplicated row to contain only a single letter from the 'Group' column. XYZ does not have any "-" and would remain…
bLund
3
votes
3 answers

R: My data frame has 2 columns that have a string of numbers in each row, is there a way to split the string and add the values of each column?

In my data frame in R, I have two columns (A and B). In each of the rows for column A and B there is a string of numbers separated by commas. Row 1, Column A - 1,2,3,4 Row 1, Column B - 5,6,7,8 I want to add the values and create another…
crazytaxi
3
votes
1 answer

Split string with non greedy regex via strsplit

I am facing a problem with regex and strsplit. I would like to split the following string x at the second : symbol: x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess" and then obtain something like…
SabDeM
3
votes
1 answer

In R, how do I split each string in a vector to return everything before the Nth instance of a character?

Example: df <- data.frame(Name = c("J*120_234_458_28", "Z*23_205_a834_306", "H*_39_004_204_99_04902")) I would like to be able to select everything before the third underscore for each row in the dataframe. I understand how to split the string…
Jay
3
votes
3 answers

Use str_split based on the position of special characters

Hi, can anyone help me with this? C <- "NURUL AMANI [ID 26378] [IC 971035186514] SYED SAHARR [ID 61839] [IC 981627015412]" str_split(C, "\\]") The result is like this: [1] "NURUL AMANI [ID 26378" " [IC 971035186514" [3] " SYED SAHARR [ID 61839" " [IC…
Wawa
3
votes
2 answers

R: strsplit on negative lookaround

Say I need to strsplit caabacb into individual letters except when a letter is followed by a b, thus resulting in "c" "a" "ab" "a" "cb". I tried using the following line, which looks OK on regex tester but does not work in R. What did I do…
dasf
3
votes
1 answer

Dictionary-like matching on string in R

I have a dataframe in which a string variable is an informal list of elements that can be split on a symbol. I would like to perform operations on these elements based on another dataset. E.g. task: calculate the sum of the elements df_1 <-…
MCS
3
votes
2 answers

More memory efficient way than strsplit() to split a string into two in R

I have a 1.8m-character string, and I need to split it by a 50-character string that appears once very close to the start of the 1.8m-character string (about 10k characters in). Using strsplit() errors: long_string %>% strsplit(.,…
stevec
3
votes
3 answers

Parse by nth delimiter in R

I have a dataframe like below: Col1 Col2 A 5!5!!6!!3!!m B 7_8!!6!!7!!t structure(list(Col1 = c("A", "B"), Col2 = c("5!5!!6!!3!!m", "7_8!!6!!7!!t" )), class = "data.frame", row.names = c(NA, -2L)) How do I create a new column that…
nak5120
3
votes
1 answer

chars <- strsplit(rquote, split = "")[[1]] in R Language

rquote <- "r's internals are irrefutably intriguing" chars <- strsplit(rquote, split = "")[[1]] In the above line of code, what is the meaning of [[1]]?
Hareesh
3
votes
3 answers

split string by multiple characters

I would like to split a character string by multiple delimiters defined in a vector: text1 <- "aweoiutw839572/)(&2aslk2468" text2 <- "147we547iu5erhg24tzu" dat <- rbind(text1, text2) vector <- c("we", "iu", "24") The result should be: var1 del1…
tobias sch
3
votes
3 answers

R: Count all combinations in a list of strings (Specific Order)

I am trying to count all sequences in a large list of characters delimited by ">", but only the combinations that are directly next to each other. E.g. given the character…
3
votes
3 answers

How can I split a string and ignore the delimiter if it's "quoted"

Say I have the following string: params <- "var1 /* first, variable */, var2, var3 /* third, variable */" I want to split it using , as a separator, then extract the "quoted substrings", so I get 2 vectors as follows: params_clean <-…
moodymudskipper