Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)

% MATLAB
strsplit(x, split)

strsplit splits a character string, or each element of a vector of character strings, at matches of a regular expression or of a literal (fixed) string. It returns a list (R) or a cell array (MATLAB). In R, the list has one item per element of x, and each item holds the substrings of that element.
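A minimal R sketch of the return shape described above. The input vector and the "-" delimiter are illustrative assumptions, not taken from any question on this page:

```r
# Illustrative input: two strings with a different number of pieces each
x <- c("a-b-c", "d-e")

# fixed = TRUE treats "-" as a literal delimiter
parts <- strsplit(x, "-", fixed = TRUE)

length(parts)    # one list item per element of x
parts[[1]]       # substrings of the first element
lengths(parts)   # number of substrings per element
```

Because the result is a list, use `[[` to pull out the pieces for a single input element.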

  • x: a character string or vector of character strings to split.
  • split: the delimiter at which x is split.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed [R only]: whether split should be treated as a fixed (literal) string. The default is FALSE, which means that split is treated as a regular expression.
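The arguments above can be sketched in R as follows. The example strings and variable names are assumptions chosen to show why fixed matters when the delimiter is a regex metacharacter such as ".":

```r
s <- "a.b.c"   # illustrative input

# Default (fixed = FALSE): "." is a regex that matches any character,
# so every character is consumed as a delimiter
as_regex <- strsplit(s, ".")[[1]]

# fixed = TRUE: "." is matched literally
as_literal <- strsplit(s, ".", fixed = TRUE)[[1]]

# Equivalent regex form: escape the metacharacter
escaped <- strsplit(s, "\\.")[[1]]

# Empty split string: split between every character
chars <- strsplit("abc", "")[[1]]

as_regex     # five empty strings
as_literal   # "a" "b" "c"
escaped      # "a" "b" "c"
chars        # "a" "b" "c"
```

When the delimiter is a plain string, fixed = TRUE is usually the safer choice, since it avoids accidental regex interpretation.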
702 questions
0
votes
1 answer

data.table syntax setting colnames and values based on values in other table via tstrsplit

After solving this issue, and still being new to data.table, I need help with a similar problem that I cannot get to work: I want to create a new dt that has the colnames of DT_1, split by [+-], as colnames. DT_1= data.table("t+e+s+t" = c(8),"t+e+s-t" =…
Rivka
0
votes
2 answers

R: gsub and str_split_fixed in data.tables

I am "converting" from data.frame to data.table. I now have a data.table: library(data.table) DT = data.table(ID = c("ab_cd.de","ab_ci.de","fb_cd.de","xy_cd.de")) DT ID 1: ab_cd.de 2: ab_ci.de 3: fb_cd.de 4: xy_cd.de new_DT<-…
Rivka
0
votes
1 answer

which read or scan function to be used to quickly store a dump of parameters from excel

I have an Excel file, an extract of which looks like this after reading it with a plain read.csv(filename). I have copy-pasted the first 16 lines of one column of the imported data. [1] UserName, UserPassword, AppVersion, UserId, UUID, AppId, Latitude,…
Lazarus Thurston
0
votes
0 answers

Split string at '+'

I have a coordinate in the following format: coordinate <- "43.12+332.11" Since I need to recover individual Latitude and Longitude parts of this coordinate, I wanted to split at the '+'. Running the following code: splitted <-…
bublitz
0
votes
2 answers

split string and extract according to a pattern to form data frame

I am trying to slice the following strings into 3 separate columns (Country, City, Count) in R: Country City Count Japan Tokyo 361 The data: "country=Japan&city=Tokyo","361" "country=Spain&city=Barcelona","359" "country=United…
tmhs
0
votes
2 answers

Unlist multiple values in dataframe column but keep track of the row number

I have a data frame that contains a column with multiple values consisting of gene name synonyms separated by semicolons: score <- c("32.01","19.5","18.0") symbol <- c("30 kDa adipocyte complemen related protein","AAT1","Cachectin") synonym <- c("30…
user1357079
0
votes
1 answer

Error in strsplit(word, NULL) : non-character argument with spell checker

I am trying to write a spelling checker in R that corrects spelling mistakes in a word or a document. This R code corrects a single word, and it works very well: > Correct("speling", dtm = counts) $l4 [1] "spelling" but when I try to…
Datackatlon
0
votes
1 answer

Splitting a Column According to the Natural Format of its Characters in R

I have the following dataframe: library(rvest) library(XML) library(tidyr) library(zoo) library(chron) library(lubridate) library(stringr) page.201702050atl =…
DataProphets
0
votes
1 answer

multiple ordered strsplit, then recombine

Given a vector of character strings, where each string is a comma-separated list of species names (i.e. Genus species). Each string can have a variable number of species in it (e.g. as shown in the example below, the number of species in a given…
Kevin W
0
votes
6 answers

R: Extract part of string with varying length

I have a list of strings (very large, millions of rows) from which I want to extract specific parts. I first split the string at the semicolon and then extract to specific sections. It's made a little more complicated as there are sometimes 3,…
ulima2_
0
votes
0 answers

Delimiting a specific column

I have a question similar to this one: "Split column at delimiter in data frame", except there is a new layer of complexity: an unknown number of delimiters. I'd like to create new columns with one Item code per column. …
Alan
0
votes
4 answers

Separate a string by a number

I am trying to separate my column VEHICLE_TYPE into Model and Engine. The code can be plain SQL or R. My data looks like this: MODEL VEHICLE_TYPE 77 Bora Bora 1.6 79 Ducato Ducato 15 120 Multijet 80 …
rayray
0
votes
1 answer

R create data table columns dynamically

I have this data table called tmp.df.lhs.denorm, whose first 2 rows I provide below: > dput(tmp.df.lhs.denorm[1:2]) structure(list(rules = c("{} => {Dental anesthetic products-Injectables cartridges|2288210-Septocaine Cart 4% w/EPI}",…
NRG
0
votes
2 answers

R string split, to normalized (long) format with running index

I have this data frame structure(list(rule.id = c(1, 2), rules = structure(1:2, .Label = c("Lamp1.1,Lamp1.2", "Lamp2.1,Lamp2.2"), class = "factor")), .Names = c("rule.id", "rules"), row.names = c(NA, -2L), class = "data.frame") # rule.id …
NRG
0
votes
1 answer

Transforming list obtained via strsplit to merge common categories

I have a list resembling the one below: # Initial object vec <- c("levelA-1", "levelA-2", "levelA-3", "levelB-1", "levelB-2", "levelB-3") lstVec <- strsplit(x = vec, split = "-") I would like to arrive at a list of the following…
Konrad