Questions tagged [subset]

A subset consists of those elements selected from a larger set of elements, by their position in the larger set or other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.
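
The definition above maps directly onto a membership check. The following is a minimal sketch in Python using the built-in set type; the names a and b are illustrative and do not come from the tag description.

```python
# Illustrative sets; the names a and b are assumptions for this sketch.
a = {1, 2}
b = {1, 2, 3}

# A is a subset of B when every element of A is also an element of B.
a_is_subset = all(x in b for x in a)

# The built-in set type offers the same check as a method and as an operator.
assert a_is_subset == a.issubset(b) == (a <= b)

# Equivalently, B is a superset of A.
b_is_superset = b.issuperset(a)
```

The explicit all(...) form mirrors the wording of the definition; in practice the issubset method or the <= operator is the more common spelling.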

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given a logical expression (caution: subset drops rows for which the expression evaluates to NA; see How to subset data in R without losing NA rows?). For programmatic use (as opposed to interactive use), it is better to use the [ (or [[) operators or the filter function from dplyr. substring extracts substrings from character strings.
  • In , a subset of an array can be obtained with array[indices].
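
The two selection styles mentioned in the tag description, by position and by value, can be sketched with a plain Python list. This is an illustration under assumed names (values, indices), not the behaviour of any specific function named in the bullets above.

```python
# Illustrative data; the variable names are assumptions for this sketch.
values = [24, 35, 46, 34, 78, 99]

# Subset by position: pick the elements at the given indices.
indices = [0, 3]
by_position = [values[i] for i in indices]

# Subset by value: keep only the elements that satisfy a condition.
by_value = [v for v in values if v > 40]

# Slicing selects a contiguous run of positions.
first_three = values[:3]
```

Each expression builds a new list and leaves values unchanged, so subsets can be taken repeatedly from the same source.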
6799 questions
29
votes
1 answer

Split/subset a data frame by factors in one column

My data is like this (for example): ID Rate State 1 24 AL 2 35 MN 3 46 FL 4 34 AL 5 78 MN 6 99 FL Data: structure(list(ID = 1:6, Rate = c(24L, 35L, 46L, 34L, 78L, 99L), State = structure(c(1L, 3L, 2L, 1L, 3L,…
titi
28
votes
2 answers

How do I select rows by two criteria in data.table in R

Let's say I have a data.table and I want to select all the rows where the variable x has a value of b. That is easy library(data.table) DT <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9) setkey(DT,x) # set a 1-column…
Farrel
28
votes
2 answers

Using multiple criteria in subset function and logical operators

If I want to select a subset of data in R, I can use the subset function. I wanted to base an analysis on data that matched one of a few criteria, e.g. that a certain variable was either 1, 2 or 3. I tried myNewDataFrame <-…
JanD
26
votes
2 answers

How to check whether the elements of an ArrayList are all contained in another ArrayList

How can I easily check to see whether all the elements in one ArrayList are all elements of another ArrayList?
troyal
25
votes
2 answers

R subsetting a data frame into multiple data frames based on multiple column values

I am trying to subset a data frame, where I get multiple data frames based on multiple column values. Here is my example >df v1 v2 v3 v4 v5 A Z 1 10 12 D Y 10 12 8 E X 2 12 15 A Z 1 10 …
Rachit Agrawal
25
votes
1 answer

Why does R use partial matching?

I know that for a list, partial matching is done when indexing using the basic operators $ and [[. For example: ll <- list(yy=1) ll$y [1] 1 But I am still an R newbie and this is new for me, partial matching of function arguments: h <-…
agstudy
25
votes
2 answers

How to define the subset operators for a S4 class?

I am having trouble figuring out the proper way to define the [, $, and [[ subset operators for an S4 class. Can anyone provide me with a basic example of defining these three for an S4 class?
Kyle Brandt
24
votes
3 answers

how do I grep in R?

I would like to choose rows based on the subsets of their names, for example If I have the following data: data <- structure(c(91, 92, 108, 104, 87, 91, 91, 97, 81, 98), .Names = c("fee-", "fi", "fo-", "fum-", "foo-", "foo1234-", "123foo-",…
David LeBauer
24
votes
4 answers

Remove group from data.frame if at least one group member meets condition

I have a data.frame where I'd like to remove entire groups if any of their members meets a condition. In this first example, if the values are numbers and the condition is NA the code below works. df <- structure(list(world = c(1, 2, 3, 3, 2, NA, 1,…
nofunsally
24
votes
3 answers

How to subset a data frame by taking only the Non NA values of 2 columns in this data frame

I am trying to subset a data frame by taking the integer values of 2 columns of my data frame Subs1<-subset(DATA,DATA[,2][!is.na(DATA[,2])] & DATA[,3][!is.na(DATA[,3])]) but it gives me an error: longer object length is not a multiple of shorter…
EnginO
24
votes
3 answers

Subset a dataframe by multiple factor levels

How can I avoid using a loop to subset a dataframe based on multiple factor levels? In the following example my desired output is a dataframe. The dataframe should contain the rows of the original dataframe where the value in "Code" equals one of…
Walter
23
votes
4 answers

Generate all subsets of size k (containing k elements) in Python

I have a set of values and would like to create list of all subsets containing 2 elements. For example, a source set ([1,2,3]) has the following 2-element subsets: set([1,2]), set([1,3]), set([2,3]) Is there a way to do this in python?
John Manak
23
votes
5 answers

Ruby Hash check is subset?

How can I tell if a Ruby hash is a subset of (or includes) another hash? For example: hash = {a: 1, b: 2, c: 3, d: 4, e: 5, f: 6, g: 7} hash.include_hash?({}) # true hash.include_hash?({f: 6, c: 3}) # true hash.include_hash?({f: 6, c:…
ma11hew28
22
votes
5 answers

How to replace NA with mean by group / subset?

I have a dataframe with the lengths and widths of various arthropods from the guts of salamanders. Because some guts had thousands of certain prey items, I only measured a subset of each prey type. I now want to replace each unmeasured individual…
djhocking
22
votes
4 answers

Randomly sample a percentage of rows within a data frame

Related to this question. gender <- c("F", "M", "M", "F", "F", "M", "F", "F") age <- c(23, 25, 27, 29, 31, 33, 35, 37) mydf <- data.frame(gender, age) mydf[ sample( which(mydf$gender=='F'), 3 ), ] Instead of selecting a number of rows (3 in…
ATMathew