Questions tagged [subset]

A subset consists of those elements selected from a larger set of elements, by their position in the larger set or other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given some logical expression (caution: subset drops rows for which the expression evaluates to NA; see "How to subset data in R without losing NA rows?"). However, for programmatic use (as opposed to interactive use) it is better to use the [ (or [[) operators or the filter function from dplyr. substring is used to extract parts of character strings.
  • In , a subset of an array can be obtained with array[indices].
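
The R behaviour described above can be sketched in base R as follows. This is a minimal illustration, not a definitive recipe: the data frame df and its columns id and score are hypothetical names invented for the example.

```r
# Hypothetical example data; `df`, `id` and `score` are illustrative names only.
df <- data.frame(id = 1:5, score = c(10, NA, 30, 5, NA))

# subset(): rows where the condition evaluates to NA are silently dropped.
by_subset <- subset(df, score > 8)

# `[` with a logical index: an NA in the condition produces an all-NA row,
# so the result has more rows than subset() returns.
by_bracket <- df[df$score > 8, ]

# Wrapping the condition in which() keeps only the rows where it is TRUE.
by_which <- df[which(df$score > 8), ]

# To keep the rows with a missing score, say so explicitly.
keep_na <- df[is.na(df$score) | df$score > 8, ]

# Subsetting a vector by position, and a character string with substring().
x <- c(10, 20, 30)
first_and_last <- x[c(1, 3)]
prefix <- substring("subset", 1, 3)
```

Comparing nrow(by_subset) with nrow(by_bracket) on your own data is a quick way to see whether missing values are affecting a subset.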
6799 questions
2
votes
3 answers

How To Split a Date & Time Observation

I am just learning R and have come up against this. I have time series observations such as 10/08/2015 02:31:04.450, and I want to split the date and the time into separate columns. Do I need to round the milliseconds in the time? If so, how? I have…
Hilly
2
votes
1 answer

Copy columns of a data frame based on the value of a third column in R

I have a data frame with 4 columns. In one of the columns I added a date, so that each value looks like this: > print(result[[4]][[10000]]) [[10000]] [1] "Jan" "14" "2012". That means that in the 10000th field of the 4th column I have these 3…
Atirag
2
votes
0 answers

Load partial columns values to existing parquet data set

I need to load a set of column values into an existing Parquet data set. The specific scenario occurs when the Hive table is created by extracting data from a data warehouse table with the required fields. During some analysis some more…
maxmithun
2
votes
1 answer

R function to filter / subset (programmatically) multiple values over one variable

Is there a function that takes one dataset, one col, one operator, but several values to evaluate a condition? v1 <- c(1:3) v2 <- c("a", "b", "c") df <- data.frame(v1, v2) Options to subset (programmatically) result <- df[df$v2 == "a" | df$v2 ==…
jpinelo
2
votes
3 answers

Brackets make a vector different. How exactly is vector expression evaluated?

I have a data frame as follows: planets type diameter rotation rings Mercury Terrestrial planet 0.382 58.64 FALSE Venus Terrestrial planet 0.949 -243.02 FALSE Earth Terrestrial planet 1.000 1.00 FALSE Mars …
JelenaČuklina
2
votes
1 answer

R doesn't recognise header from sav file

I'm importing a sav file into RStudio. Now I want to select only a specific nation (column header: nation) and a specific year (column header: year), using the following code: myfile_nation_year <- subset(myfile, (nation == "Great Britain") & (year ==…
2
votes
3 answers

R - keep all columns but only select rows that meet multiple criteria from multiple columns

I have many rows of eye-tracking data (longer fixations and shorter saccades) in my dataframe, each row corresponds to a sample taken by the eye-tracker (see column timestamp). The data proceeds progressively, with counts of fix and sac, their…
Val
2
votes
1 answer

R: subset many objects efficiently

I very often use logical vectors to subset other vectors, matrices and data frames (in the genomics field, it's very common). One such vector would be made like so: condition <- myNucleotideVector == "G". Then I work on subsets matching that…
jeanlain
2
votes
0 answers

R subset does not respond as expected. Right way to write it?

I have a typical 100 by 2 data set. I want to keep only the rows where the 1st column ('station') equals 3 and the 2nd column ('complete') equals 1. However, none of my attempts works: data[data[,2]==1] subset(data,complete=1) data[complete=1] - returns just…
Ilja
2
votes
2 answers

removing columns via subset throws unary invalid argument error

I have a dataframe with rather odd column names (it is a combination of several other dataframes). Every time I try to subset and remove a column, it gives me the error: Error in -c("surveys$gender") : invalid argument to unary operator. Can someone…
Rilcon42
2
votes
2 answers

advanced row deleting in R

I am looking to do row deleting in R based on advanced selection logic (i.e. not just a simple subset). Here is some sample code and what I need to do v1 <- c(1:11) v2 <- c('a','a','b','b','b','b','c','c','c','c','c') v3 <-…
dgssd
2
votes
1 answer

xts subsetting gives incorrect results for months

I am using R 3.2.1 for Mac OS X and seem to have run into incorrect behavior in xts subsetting. In brief, subsetting monthly data gives a result that is lagged by 1 month from what it should be. Here is a simple example that is similar to an analysis…
2
votes
1 answer

lmList diagnostic plots - is it possible to subset data during a procedure or do data frames have to be subset and then passed in?

I am new to R and am trying to produce a vast number of diagnostic plots for linear models for a huge data set. I discovered the lmList function from the nlme package. This works a treat but what I now need is a means of passing in a fraction of…
2
votes
2 answers

R: Find max value for column among a subset of a data frame

I have a dataframe df with columns ID, Year, Value1, Value2, Value3 and 21788928 rows. I need to subset the data by Year and ID and find the max Value1 in that subset, keeping the rest of the information in that row. I need to do that for all the…
Liza
2
votes
2 answers

Conditional searching which omits NA values

I'm doing a conditional search of part of a dataset that has multiple NA values within each row. Something like this (a preview): time1 time2 time3 time4 slice1 slice2 slice3 slice4 pt1 1 3 NA NA NA 1 3 …
Avi