Questions tagged [subset]

A subset consists of elements selected from a larger set, either by their position in that set or by other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.
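
As a rough illustration of the definition above, the check "all elements of A are also elements of B" can be sketched in Python. The helper name `is_subset` is hypothetical and introduced only for this example; `set.issubset` and the `<=` operator are Python's built-in equivalents.

```python
def is_subset(a, b):
    """Return True if every element of a is also an element of b."""
    return all(x in b for x in a)


A = {1, 2}
B = {1, 2, 3}

# The hand-written check and the built-in forms agree.
print(is_subset(A, B))
print(A.issubset(B))
print(A <= B)

# B is a superset of A, but not the other way round.
print(is_subset(B, A))
```

Note that under this definition the empty set is a subset of every set, and every set is a subset of itself.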

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given some logical expression (caution: subset drops rows for which the condition evaluates to NA; see How to subset data in R without losing NA rows?). However, for programmatic use (as opposed to interactive use) it is better to use the [ (or [[) operators or the filter function from dplyr. substring extracts parts of character strings.
  • In , a subset of an array can be obtained with array[indices].
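
The two selection styles mentioned above (by position and by value) can be sketched with plain Python lists. This is only a minimal illustration under stated assumptions: the variable names are made up for the example, and `None` is used as a stand-in for a missing value, so it does not reproduce R's NA semantics.

```python
values = [10, 25, 3, 42, 7]

# Subset by position: a slice, or an explicit collection of indices.
first_three = values[:3]
picked = [values[i] for i in (0, 3)]

# Subset by value: keep only the elements that satisfy a condition.
large = [v for v in values if v > 9]

# Missing entries must be handled explicitly; comparing None with a
# number raises TypeError in Python, so they are filtered out first.
readings = [4.0, None, 9.5, None, 1.2]
above_two = [r for r in readings if r is not None and r > 2]

print(first_three, picked, large, above_two)
```

Whether missing entries should be dropped or kept is a design decision that is worth making explicit in the condition itself, rather than relying on a function's default behaviour.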
6799 questions
86
votes
3 answers

Subset / filter rows in a data frame based on a condition in a column

Given a data frame "foo", how can I select only those rows from "foo" where e.g. foo$location = "there"? foo = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = 6:10) foo # location x y # 1 here 1 6 # 2 …
wishihadabettername
  • 14,231
  • 21
  • 68
  • 85
85
votes
1 answer

Undefined columns selected when subsetting data frame

I have a data frame, str(data) to show more about my data frame the result is the following: > str(data) 'data.frame': 153 obs. of 6 variables: $ Ozone : int 41 36 12 18 NA 28 23 19 8 NA ... $ Solar.R: int 190 118 149 313 NA NA 299 99 19 194…
CreamStat
  • 2,155
  • 6
  • 27
  • 43
85
votes
10 answers

Subset data to contain only columns whose names match a condition

Is there a way for me to subset data based on column names starting with a particular string? I have some columns which are like ABC_1 ABC_2 ABC_3 and some like XYZ_1, XYZ_2,XYZ_3 let's say. How can I subset my df based only on columns containing…
user2724207
77
votes
7 answers

Subsetting R data frame results in mysterious NA rows

I've been encountering what I think is a bug. It's not a big deal, but I'm curious if anyone else has seen this. Unfortunately, my data is confidential, so I have to make up an example, and it's not going to be very helpful. When subsetting my…
chrisg
  • 801
  • 2
  • 8
  • 5
73
votes
10 answers

Subset and ggplot2

I have a problem to plot a subset of a data frame with ggplot2. My df is like: df = data.frame(ID = c('P1', 'P1', 'P2', 'P2', 'P3', 'P3'), Value1 = c(100, 120, 300, 400, 130, 140), Value2 = c(12, 13, 11, 16, 15,…
matteo
  • 4,683
  • 9
  • 41
  • 77
70
votes
10 answers

best way to pick a random subset from a collection?

I have a set of objects in a Vector from which I'd like to select a random subset (e.g. 100 items coming back; pick 5 randomly). In my first (very hasty) pass I did an extremely simple and perhaps overly clever solution: Vector itemsVector =…
Tom
  • 803
  • 1
  • 7
  • 5
66
votes
5 answers

subsetting a Python DataFrame

I am transitioning from R to Python. I just began using Pandas. I have an R code that subsets nicely: k1 <- subset(data, Product = p.id & Month < mn & Year == yr, select = c(Time, Product)) Now, I want to do similar stuff in Python. this is what I…
user1717931
  • 2,419
  • 5
  • 29
  • 40
64
votes
14 answers

How to find all subsets of a set in JavaScript? (Powerset of array)

I need to get all possible subsets of an array. Say I have this: [1, 2, 3] How do I get this? [], [1], [2], [3], [1, 2], [2, 3], [1, 3], [1, 2, 3] I am interested in all subsets. For subsets of specific length, refer to the following…
le_m
  • 19,302
  • 9
  • 64
  • 74
63
votes
1 answer

R Not in subset

Possible Duplicate: Standard way to remove multiple elements from a dataframe I know in R that if you are searching for a subset of another group or matching based on id you'd use something like subset(df1, df1$id %in% idNums1) My question is…
screechOwl
  • 27,310
  • 61
  • 158
  • 267
60
votes
6 answers

Selecting columns in R data frame based on those *not* in a vector

I'm familiar with being able to extract columns from an R data frame (or matrix) like so: df.2 <- df[, c("name1", "name2", "name3")] But can one use a ! or other tool to select all but those listed columns? For background, I have a data frame with…
Hendy
  • 10,182
  • 15
  • 65
  • 71
58
votes
2 answers

Extract matrix column values by matrix column name

Is it possible to get a matrix column by name from a matrix? I tried various approaches such as myMatrix["test", ] but nothing seems to work.
Suraj
  • 35,905
  • 47
  • 139
  • 250
58
votes
10 answers

Difference between subarray, subset & subsequence

I'm a bit confused between subarray, subsequence & subset if I have {1,2,3,4} then subsequence can be {1,2,4} OR {2,4} etc. So basically I can omit some elements but keep the order. subarray would be( say subarray of size 3) {1,2,3} {2,3,4}…
user2821242
  • 1,041
  • 3
  • 9
  • 16
55
votes
6 answers

Difference between subset and filter from dplyr

It seems to me that subset and filter (from dplyr) are having the same result. But my question is: is there at some point a potential difference, for ex. speed, data sizes it can handle etc? Are there occasions that it is better to use one or the…
Ruthger Righart
  • 4,799
  • 2
  • 28
  • 33
55
votes
4 answers

Subset rows in a data frame based on a vector of values

I have two data sets that are supposed to be the same size but aren't. I need to trim the values from A that are not in B and vice versa in order to eliminate noise from a graph that's going into a report. (Don't worry, this data isn't being…
Zelbinian
  • 3,221
  • 5
  • 20
  • 23
52
votes
18 answers

find all subsets that sum to a particular value

Given a set of numbers: {1, 3, 2, 5, 4, 9}, find the number of subsets that sum to a particular value (say, 9 for this example). This is similar to subset sum problem with the slight difference that instead of checking if the set has a subset that…
Darth.Vader
  • 5,079
  • 7
  • 50
  • 90