Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)
% MATLAB
strsplit(x, split);

strsplit splits a character string or vector of character strings at matches of a regular expression or a literal (fixed) string. The function returns a list (R) or cell array (MATLAB), in which each item corresponds to an element of x that has been split.

  • x: a character string or vector of character strings to split.
  • split: the character string at which x is split.
    In R, if split is an empty string (""), then x is split between every character.
  • [R only:] fixed: whether the split argument should be treated as fixed (i.e. literally). The default is FALSE, which means that split is treated as a regular expression.
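
As a rough illustration of the arguments described above, the following R sketch shows a literal split, a regular-expression split, and an empty split string. The input values and variable names (x, literal, digits, chars) are made up for this example and do not come from any of the questions listed below.

```r
# Illustrative input; the values and variable names are hypothetical.
x <- c("a.b.c", "d.e")

# fixed = TRUE treats "." literally. The result is a list with one
# character vector per element of x.
literal <- strsplit(x, ".", fixed = TRUE)
str(literal)

# With the default fixed = FALSE, split is a regular expression,
# so a pattern such as "one or more digits" can act as the delimiter.
digits <- strsplit("a1b22c", "[0-9]+")

# An empty split string splits between every character.
chars <- strsplit("abc", "")

# Even a single input string yields a list; [[1]] extracts the vector.
chars[[1]]
```

Because the default is fixed = FALSE, a split such as "." is read as the regular expression "any character" unless fixed = TRUE is set or the dot is escaped (for example "\\.").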
702 questions
8
votes
3 answers

R: how to display the first n characters from a string of words

I have the following string: Getty <- "Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal." I want to display the first 10…
mapleleaf
  • 758
  • 3
  • 8
  • 14
8
votes
1 answer

Why does strsplit return a list

Consider text <- "who let the dogs out" fooo <- strsplit(text, " ") fooo [[1]] [1] "who" "let" "the" "dogs" "out" the output of strsplit is a list. The list's first element then is a vector, that contains the words above. Why does the function…
FooBar
  • 15,724
  • 19
  • 82
  • 171
8
votes
4 answers

split string with regex

I'm looking to split a string of a generic form, where the square brackets denote the "sections" of the string. Ex: x <- "[a] + [bc] + 1" And return a character vector that looks like: "[a]" " + " "[bc]" " + 1" EDIT: Ended up using this: x <-…
Jeff Keller
  • 771
  • 1
  • 7
  • 15
8
votes
2 answers

Regex; eliminate all punctuation except

I have the following regex that splits on any space or punctuation. How can I exclude 1 or more punctuation characters from :punct:? Let's say I'd like to exclude apostrophes and commas. I know I could explicitly use [all punctuation marks in…
Tyler Rinker
  • 108,132
  • 65
  • 322
  • 519
7
votes
2 answers

How can I split a string "a\n\nb" into c("a", "", "", "b")?

When I stringr::str_split "\n\na\n" by "\n", I obtain c("", "", "a", ""). I expected to obtain c("a", "", "", "b") when I stringr::str_split "a\n\nb" by "\n", but I obtained c("a", "", "b") instead. How can I obtain c("a", "", "", "b") by…
김용석
  • 71
  • 3
7
votes
2 answers

Split data.frame into groups by column name

I'm new to R. I have a data frame with column names of such type: file_001 file_002 block_001 block_002 red_001 red_002 ....etc' 0.05 0.2 0.4 0.006 0.05 0.3 0.01 0.87 0.56 0.4 …
Keity
  • 143
  • 1
  • 10
7
votes
4 answers

R Split string and keep substrings righthand of match?

How to do this stringsplit() in R? Stop splitting when no first names separated by dashes remain. Keep right hand side substring as given in results. a <- c("tim/tom meyer XY900 123kncjd", "sepp/max/peter moser VK123 456xyz") # result: c("tim…
Kay
  • 2,702
  • 6
  • 32
  • 48
7
votes
4 answers

Separate "Name" into "FirstName" and "LastName" columns of data frame

I am struggling to figure out how to take a single column of "Name" in a data frame and split it into two other columns, FirstName and LastName, within the same data frame. The challenge is that some of my Names have several last names. Essentially, I…
RyanL
  • 73
  • 1
  • 1
  • 3
7
votes
7 answers

Use strsplit to get last character in r

I have a file of baby names that I am reading in and then trying to get the last character in the baby name. For example, the file looks like.. Name Sex Anna F Michael M David M Sarah F I read this in using sourcenames =…
CodeLearner
  • 389
  • 2
  • 6
  • 14
7
votes
2 answers

R: splitting a string between two characters using strsplit()

Let's say I have the following string: s <- "ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705" I would like to recover the strings between ";" and "=" to get the following output: [1] "MIMAT0027618" "MIMAT0027618" …
biohazard
  • 2,017
  • 10
  • 28
  • 41
7
votes
3 answers

Speed up `strsplit` when possible output are known

I have a large data frame with a factor column that I need to divide into three factor columns by splitting up the factor names by a delimiter. Here is my current approach, which is very slow with a large data frame (sometimes several million…
Noam Ross
  • 5,969
  • 5
  • 24
  • 40
6
votes
3 answers

Split string with repeated delimiters

I have a string in R in the following form: example <- c("namei1 namej1, surname1, name2, surnamei2 surnamej2, name3, surname3") And I wish to obtain two columns: namei1 namej1 | surname1 name2 | surnamei2 surnamej2 name3 |…
anespinosa
  • 73
  • 3
6
votes
3 answers

Count characters of a section of a string

I have this df: dput(df) structure(list(URLs = c("http://bursesvp.ro//portal/user/_/Banco_Votorantim_Cartoes/0-7f2f5cb67f1-22918b.html", "http://46.165.216.78/.CartoesVotorantim/Usuarios/Cadastro/BV6102891782/",…
Sotos
  • 51,121
  • 6
  • 32
  • 66
6
votes
2 answers

String split on a number word pattern

I have a data frame that looks like this: V1 V2 peanut butter sandwich 2 slices of bread 1 tablespoon peanut butter What I'm aiming to get is: V1 V2 peanut butter sandwich 2 slices of bread peanut…
yokota
  • 1,007
  • 12
  • 23
6
votes
2 answers

R strsplit doesn't split on "."?

I am writing an R script and want to define a variable to be used in plot annotations as part of the file name. I thought I would use the strsplit() function. Here is my code and the output: infile = "ACC_1346.table.txt" x = strsplit(infile,…
Slavatron
  • 2,278
  • 5
  • 29
  • 40