Questions tagged [strsplit]

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings around a given delimiter:

# R
strsplit(x, split, fixed = FALSE)

% MATLAB
strsplit(x, split)

strsplit splits a character string (in R, each string in a character vector) at matches of a regular expression or a literal (fixed) string. The result is a list (R) or cell array (MATLAB) of substrings; in R, each list element holds the pieces of the corresponding element of x.

  • x: a character string or vector of character strings to split.
  • split: the character string at which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed (R only): whether split should be matched literally. The default is FALSE, which means that split is treated as a regular expression.
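
A minimal sketch in R of the arguments described above. The input vector x and the variable names literal, regex, and chars are made up for illustration; only base R's strsplit is used.

```r
# Hypothetical input: two strings delimited by periods
x <- c("a.b.c", "d.e")

# fixed = TRUE: "." is matched literally
literal <- strsplit(x, ".", fixed = TRUE)

# Default (fixed = FALSE): split is a regular expression, where a bare "."
# matches any character, so the period is wrapped in a character class
regex <- strsplit(x, "[.]")

# Empty split: the string is broken into single characters
chars <- strsplit("abc", "")

# The result is a list with one element per element of x
print(literal)
print(length(literal))
```

Both literal and regex hold the pieces "a", "b", "c" for the first string and "d", "e" for the second. Passing "." without fixed = TRUE or escaping is a common source of surprising results, because every character is then treated as a delimiter.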
702 questions
0 votes, 2 answers

r remove character before space in a string

I have a column within a dataframe where some values are like this Col1 Y 183.21 500.23 432.89 Y 428.29 Y500 I am looking for a way to remove only those Y prior to those strings that have Y and some characters separated by a…
Kim Jenkins
0 votes, 1 answer

How to loop a strsplit over multiple columns in R

I have an assignment for stat computing, and now I get stuck on something you all probably think pretty easy, I won't ask you to solve the whole thing for me, however this is the problem: I have a data frame with multiple columns I need to slip…
Good2Bi
0 votes, 1 answer

splitting vector by regular expression into dataframe

I have a vector that looks like this head(val) [1] "PD2323 [403-407]" "P05230 [455-459]" I would like to split it into a dataframe with 3 columns and many rows. The output should look something like this: head(output) [,1] …
user3067923
0 votes, 1 answer

Splitting a string few characters after the delimiter

I have a large data set of names and states that I need to split. After splitting, I want to create new rows with each name and state. My data strings are in multiple lines that look like this "Peter Johnson, IN Chet Charles, TX Ed Walsh, AZ" "Ralph…
Kavita
0 votes, 2 answers

in R split a column in a dataframe with different length

I am trying to split a column Awards in a dataframe but the column when split returns different number of results , how do I bind it back to the original dataframe: SAMPLE DF: Name Value Awards 1 A1 NA 3 wins. 2 …
E B
0 votes, 0 answers

R long data frame and efficient way to read columns in numeric format

I am working with word2vec data available at Spanish Billion Words Corpus and Embeddings The dataset looks like this v1 v2 v3 once 0.1 0.2 upon 0.3 0.4 a 0.5 0.6 time 0.7 0.8 ... + thousands of lines and columns ... This is my code to read…
pachadotdev
0 votes, 2 answers

Split String after first character

I have a column in a data frame like so: D0.5 A4 C1.3 B2.0 I want to be able to split the column so that the first entry (which is always a single character) is separated from the rest of the entry (which is always numeric, but is of different…
User247365
0 votes, 3 answers

Extracting a number of a string of varying lengths

Pretend I have a vector: testVector <- c("I have 10 cars", "6 cars", "You have 4 cars", "15 cars") Is there a way to go about parsing this vector, so I can store just the numerical values: 10, 6, 4, 15 If the problem were just "15 cars" and "6…
Sheila
0 votes, 1 answer

Error: missing value where True/False

I am trying to delete all values in a list that have the tag ".dsw". My list is a list of files using the function list.files. This is my code: for (file in GRef) { if (strsplit(file, "[.]")[[1]][3] == "dsw") { #GRef=GRef[-file] for(n in…
naveace
0 votes, 2 answers

making dataframe by combining columns with lists with missing data, strsplit, without an index

Apologies if this is obvious, I've found something for when there's an index or for when columns are missing. But I don't think either will work for this. Example data: df.test=data.frame( A=c("n,n,y,n" ,"t", "j,k,k") …
john
0 votes, 2 answers

Replace loop with apply function

I'm using a for loop to create a document term matrix. My actual problem uses an obscure package called RMeCab to tokenize Japanese text, but here a more standard equivalent using strsplit. My current code: Documents <- data.frame(Names=…
Mark R
0 votes, 1 answer

how to transform rows to columns in R

I would like to know how to transform rows to columns for the following dataset. School class Avg Subavg Sub ABC 2 25.3 17.2 Geo ABC 2 25.3 18.2 Mat ABC 2 25.3 20.2 Fre ABC 3 21.2 17.2 Geo ABC 3 21.2 …
Ram
0 votes, 0 answers

Extract texts from a large character string based a pattern

I have a large string of characters and would like to extract certain information from it matching pattern: str(input) chr [1:109094] "{'asin': '0981850006', 'description': 'Steven Raichlen\'s Best of Barbecue Primal Grill DVD. The first three…
vanja_65
0 votes, 2 answers

One Hot Encoding of complex variables

I have a dataset where all my data is categorical and I would like to use one hot encoding for further analysis. Main issues I would like to resolve: Some cells contain many text in one cell (an example will follow). Some numerical values need to…
Boro Dega
0 votes, 0 answers

R list into Matrix

I'm working with a list in which I've split strings, so each element can have a different number of split characters. I'd like this as a matrix in which missing characters are null. test_data <- cbind(c("laptop", "lenovo", "apple watch",…
jKraut