Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R (documentation) and MATLAB (documentation), which splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)
% MATLAB
strsplit(x, split)

strsplit splits a character string or vector of character strings at each match of a regular expression or a literal (fixed) string. In R, the function returns a list in which each item holds the pieces of the corresponding element of x. In MATLAB, where x is a single piece of text, it returns a cell array of the pieces.

  • x: a character string or vector of character strings to split.
  • split: the character string at which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • [R only:] fixed: whether the split argument should be treated as fixed (i.e. matched literally). The default is FALSE, which means that split is treated as a regular expression.
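
As a small illustration of the arguments above, the following R sketch shows the list output, the effect of fixed, and splitting on an empty string. The input values and variable names are made up for this example.

```r
# Illustrative inputs; these values are assumptions, not from the tag wiki.
x <- c("a,b,c", "d.e")

# One list item per element of x, each holding that element's pieces.
by_comma <- strsplit(x, ",")
# by_comma[[1]] is "a" "b" "c"; by_comma[[2]] is "d.e" (no comma, so unsplit)

# Default fixed = FALSE: "." is a regular expression matching any character,
# so every character of "d.e" is consumed as a delimiter.
as_regex <- strsplit("d.e", ".")[[1]]

# fixed = TRUE: "." is matched literally.
as_literal <- strsplit("d.e", ".", fixed = TRUE)[[1]]   # "d" "e"

# Empty split: the string is split between every character.
chars <- strsplit("abc", "")[[1]]                        # "a" "b" "c"
```

Forgetting fixed = TRUE (or escaping, as in "\\.") when splitting on characters such as "." or "|" is a frequent source of surprising results, because those characters have special meaning in regular expressions.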
702 questions
4
votes
3 answers

How to split each of the columns in a data frame to two column?

I have a data frame which looks like this (4 rows and 5 columns): Marker ind1 ind2 ind3 ind4 mark1 CT TT CT TT mark2 AG AA AG AA mark3 AC …
mahmood
  • 1,203
  • 5
  • 16
  • 27
4
votes
6 answers

extract comma separated strings

I have a data frame as below. This is a sample data set with uniform-looking patterns, but the whole data set is not very uniform: locationid address 1073744023 525 East 68th Street, New York, NY 10065, USA 1073744022 270 Park Avenue, New…
Cagg
  • 159
  • 1
  • 8
4
votes
6 answers

Split string into 2 letters

I am trying to split a string into 1, 2 and 3 segments. For example, I currently have this: $str = 'test'; $arr1 = str_split($str); foreach($arr1 as $ar1) { echo strtolower($ar1).' '; } Which works well on 1 character splitting, I get: t e s t…
Gixxy22
  • 507
  • 1
  • 8
  • 20
4
votes
3 answers

R: Removing Whitespace + Delimiter

I'm fairly new to the R language. So I have this vector containing the following: > head(sampleVector) [1] "| txt01 | 100 | 200 | 123.456 | 0.12345 |" [2] "| txt02 | 300 | 400 | 789.012 | 0.06789…
12341234
  • 404
  • 5
  • 16
4
votes
2 answers

error in strsplit when trying to separate by a comma

I have the vector length # [1] 15,34, 12,24, 225, # Levels: 12,24, 15,34, 225, and I want to separate them by the comma to eventually make a list of these values Tried: strsplit(length, ",") but keep getting the error message Error in…
Nazrath10R
  • 41
  • 1
  • 1
  • 3
4
votes
1 answer

Duplicating & modifying rows of a dataframe dependent on observations [R]

This is a follow up to this question: Duplicating observations of a dataframe, but also replacing specific variable values in R I have tried to write as succinctly as possible, whilst giving all necessary information. In this current example, I…
jalapic
  • 13,792
  • 8
  • 57
  • 87
4
votes
2 answers

Split string by words in R

I would like to split a string by two words: s <- "PCB153 treated HepG2 cells at T18" strsplit(s, split = <>) What should I write instead of <>? I would get: "PCB153" "HepG2 cells" "T18"
charisz
  • 302
  • 4
  • 12
4
votes
2 answers

best way to manipulate strings in big data.table

I have a 67MM row data.table with people's names and surnames separated by spaces. I just need to create a new column for each word. Here is a small subset of the data: n <- structure(list(Subscription_Id = c("13.855.231.846.091.000",…
marbel
  • 7,560
  • 6
  • 49
  • 68
4
votes
3 answers

split string without loss of characters

I wish to split strings at a certain character while retaining that character in the second resulting string. I can achieve almost all of the desired operation, except that I lose the characters I specify in strsplit, which I guess is called the…
Mark Miller
  • 12,483
  • 23
  • 78
  • 132
3
votes
5 answers

Add a function to Matlab path

I am trying to add the strsplit function to my MATLAB path, but I don't know how to do it. Link: strsplit function. I am trying to use the function for my work, but it does not exist in the version of MATLAB that I currently have.
Jeiman
  • 1,121
  • 9
  • 27
  • 50
3
votes
1 answer

How do I return a specific substring within a Pandas dataframe

I have a column of text that I need to find the substring and return the whole word, but can't figure out how to get the entire word. Each column has text with a coding at the bottom labelled "ATT03", "ATT04" etc and I want to take that ATT and make…
JLondon
  • 31
  • 2
3
votes
1 answer

How does strsplit work on fixed elements with the splitter at the end of the string to split?

I was working on a language parser and I wanted to count certain string elements (say "") in a larger string. Since the string has been cleansed (str.trim), it doesn't have any content after it. I was getting some weird behavior on strsplit as…
mshaffer
  • 959
  • 1
  • 9
  • 19
3
votes
3 answers

Removal of specific string and anything after

I would like to remove 'sat' and anything after but not 'saturday'. Although this seems quite simple I have been unable to find a thread on this. Example: text <- c("good morning amer sat","this morning saturday") Desired Result: "good morning…
megmac
  • 547
  • 2
  • 11
3
votes
1 answer

Apply strsplit by conditional

I tried to apply the below rules: Chop the string by ; to reach maximum length n. For example, n <- 4 string <- c("a;a;aabbbb;ccddee;ff") output <- c("a;a;", "aabb", "bb;", "ccdd", "ee;", "ff") For "aabb", since the chop length "aabbbb" exceed n =…
Howard
  • 61
  • 4
3
votes
3 answers

unlist and split a column to add to rows without losing information of other column in R

I have a column with the special character ";" that I want to separate into new rows without losing the information in the other columns. Let's say this is my df: col1 year A;B 2010 A 2010 B 2011 B;C 2012 The desired result: col1 year A …
Cina
  • 9,759
  • 4
  • 20
  • 36