Questions tagged [subset]

A subset consists of elements selected from a larger set, either by their position in that set or by other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given some logical expression (caution: subset drops incomplete cases; see How to subset data in R without losing NA rows?). However, for programmatic use (as opposed to interactive use), it is better to use the [ (or [[) operators or the filter function from dplyr. substring is used to extract parts of character strings.
  • In , a subset of an array can be obtained with array[indices].
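
As a rough illustration of the ideas above, here is a minimal Python sketch that selects a subset by position, selects one by value, and checks the set-theoretic definition. The variable names and sample data are made up for the example and do not come from any particular question.

```python
# Hypothetical sample data for illustration only.
values = [10, 25, 3, 42, 7]

# Subset by position: pick the elements at the given indices.
indices = [0, 3]
by_position = [values[i] for i in indices]

# Subset by value: keep the elements that satisfy a condition.
by_value = [v for v in values if v > 9]

# Set-theoretic check: A is a subset of B if every element of A is in B.
is_subset = set(by_position) <= set(values)
```

The `<=` operator on Python sets is equivalent to calling `issubset`, so either spelling works for the last step.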
6799 questions
52
votes
7 answers

Subset with unique cases, based on multiple columns

I'd like to subset a dataframe to include only rows that have unique combinations of three columns. My situation is similar to the one presented in this question, but I'd like to preserve the other columns in my data as well. Here's my example: >…
bosbmgatl
  • 928
  • 3
  • 9
  • 12
48
votes
4 answers

SQL: How To Select Earliest Row

I have a report that looks something like this: CompanyA Workflow27 June5 CompanyA Workflow27 June8 CompanyA Workflow27 June12 CompanyB Workflow13 Apr4 CompanyB Workflow13 Apr9 CompanyB Workflow20 …
dvanaria
  • 6,593
  • 22
  • 62
  • 82
48
votes
3 answers

Return data subset time frames within another timeframes?

There are very nifty ways of subsetting xts objects. For example, one can get all the data for all years, months, days but being strictly between 9:30 AM and 4 PM by doing: my_xts["T09:30/T16:00"] Or you can get all the observations between two…
Alex
  • 19,533
  • 37
  • 126
  • 195
47
votes
10 answers

Find all possible subset combos in an array?

I need to get all possible subsets of an array with a minimum of 2 items and an unknown maximum. Anyone that can help me out a bit? Say I have the following array: [1, 2, 3] How do I get this? [ [1, 2], [1, 3], [2, 3], [1, 2, 3] ]
Stephen Belanger
  • 6,251
  • 11
  • 45
  • 49
45
votes
3 answers

subset a column in data frame based on another data frame/list

I have the following table1 which is a data frame composed of 6 columns and 8083 rows. Below I am displaying the head of this table1: |gene ID | prom_65| prom_66| amast_69| amast_70| …
BCArg
  • 2,094
  • 2
  • 19
  • 37
43
votes
16 answers

Calculating all of the subsets of a set of numbers

I want to find the subsets of a set of integers. It is the first step of "Sum of Subsets" algorithm with backtracking. I have written the following code, but it doesn't return the correct answer: BTSum(0, nums); ///************** ArrayList
Elton.fd
  • 1,575
  • 3
  • 17
  • 24
43
votes
3 answers

Remove highly correlated variables

I have a huge dataframe (5600 X 6592) and I want to remove any variables that are correlated with each other by more than 0.99. I do know how to do this the long way, step by step, i.e. forming a correlation matrix, rounding the values, removing similar ones…
Error404
  • 6,959
  • 16
  • 45
  • 58
43
votes
1 answer

R: Why is the [[ ]] approach for subsetting a list faster than using $?

I've been working on a few projects that have required me to do a lot of list subsetting and while profiling code I realised that the object[["nameHere"]] approach to subsetting lists was usually faster than the object$nameHere approach. As an…
Jon M
  • 1,157
  • 1
  • 10
  • 16
40
votes
22 answers

Finding all the subsets of a set

I need an algorithm to find all of the subsets of a set where the number of elements in a set is n. S={1,2,3,4...n} Edit: I am having trouble understanding the answers provided so far. I would like to have step-by-step explanation of how the…
Rahul Vyas
  • 28,260
  • 49
  • 182
  • 256
40
votes
5 answers

creating a new list with subset of list using index in python

A list: a = ['a', 'b', 'c', 3, 4, 'd', 6, 7, 8] I want a list using a subset of a using a[0:2],a[4], a[6:], that is I want a list ['a', 'b', 4, 6, 7, 8]
user2783615
  • 829
  • 3
  • 11
  • 17
40
votes
3 answers

How to select some rows with specific rownames from a dataframe?

I have a data frame with several rows. I want to select some rows with specific rownames (such as stu2,stu3,stu5,stu9) from this dataframe. The input example dataframe is as follows: attr1 attr2 attr3 attr4 stu1 0 0 1 0 …
user2405694
  • 847
  • 2
  • 8
  • 19
40
votes
16 answers

Find all subsets of length k in an array

Given a set {1,2,3,4,5...n} of n elements, we need to find all subsets of length k . For example, if n = 4 and k = 2, the output would be {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}. I am not even able to figure out how to start. We don't have…
h4ck3d
  • 6,134
  • 15
  • 51
  • 74
39
votes
8 answers

Subset dataframe by multiple logical conditions of rows to remove

I would like to subset (filter) a dataframe by specifying which rows not (!) to keep in the new dataframe. Here is a simplified sample dataframe: data v1 v2 v3 v4 a v d c a v d d b n p g b d d h c k d c c r p g d v d …
Jota
  • 17,281
  • 7
  • 63
  • 93
38
votes
2 answers

Apply function on a subset of columns (.SDcols) whilst applying a different function on another column (within groups)

This is very similar to a question about applying a common function to multiple columns of a data.table using .SDcols, answered thoroughly here. The difference is that I would like to simultaneously apply a different function on another column which is not…
Matt Weller
  • 2,684
  • 2
  • 21
  • 30
38
votes
4 answers

Subset data frame based on number of rows per group

I have data like this, where some "name" occurs more than three times: df <- data.frame(name = c("a", "a", "a", "b", "b", "c", "c", "c", "c"), x = 1:9) name x 1 a 1 2 a 2 3 a 3 4 b 4 5 b 5 6 c 6 7 c 7 8 c 8 9 c 9 I…
SJSU2013
  • 585
  • 3
  • 8
  • 18