Questions tagged [subset]

A subset consists of those elements selected from a larger set of elements, by their position in the larger set or other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.
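
The definition above can be checked mechanically. The following is a minimal sketch in Python, using made-up example sets `A` and `B` (these values are assumptions for illustration, not from the quoted definition):

```python
# Hypothetical example sets, chosen only for illustration.
A = {1, 2}
B = {1, 2, 3}

# A is a subset of B when every element of A is also an element of B.
a_in_b = all(x in B for x in A)

# Python's built-in set type expresses the same test directly.
a_in_b_builtin = A.issubset(B)   # equivalent operator form: A <= B

# Equivalently, B is a superset of A.
b_superset = B.issuperset(A)     # equivalent operator form: B >= A

# {1, 4} is not a subset of B, because 4 is not an element of B.
not_subset = {1, 4} <= B
```

The manual `all(...)` check and the built-in `issubset` agree here; the built-in form is usually preferable because it states the intent directly.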

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given some logical expression (caution: subset drops rows for which the expression evaluates to NA; see How to subset data in R without losing NA rows?). However, for programmatic use (as opposed to interactive use) it is better to use the [ (or [[) operators or the filter function from dplyr. substring is used to extract parts of character strings.
  • In many languages, a subset of an array can be obtained by indexing, e.g. array[indices].
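
The two kinds of selection mentioned in this tag description, by position and by value, might look like this in plain Python. This is only a sketch: the list `values`, the positions in `indices`, and the threshold `15` are all hypothetical.

```python
# Hypothetical data, chosen only for illustration.
values = [10, 25, 3, 40, 18]

# Subset by position: a slice takes a contiguous run of elements.
first_three = values[:3]

# Subset by position: pick arbitrary index positions.
indices = [0, 3]
picked = [values[i] for i in indices]

# Subset by value: keep the elements that satisfy a logical condition.
large = [v for v in values if v > 15]
```

Each result is a new list; the original `values` is left unchanged.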
6799 questions
37
votes
8 answers

how to remove multiple columns in r dataframe?

I am trying to remove some columns in a dataframe. I want to know why it worked for a single column but not with multiple columns, e.g. this works: album2[,5]<- NULL but this doesn't work: album2[,c(5:7)]<- NULL Error in `[<-.data.frame`(`*tmp*`, , 5:7,…
Ahmed Elmahy
37
votes
8 answers

Subset a dataframe between 2 dates

I am working with daily returns from a Brazilian Index (IBOV) since 1993, I am trying to figure out the best way to subset for periods between 2 dates. The data frame (IBOV_RET) is as follows : head(IBOV_RET) DATE 1D_RETURN 1 1993-04-28…
RiskTech
36
votes
3 answers

How to remove rows of a matrix by row name, rather than numerical index?

I have matrix g: > g[1:5,1:5] rs7510853 rs10154488 rs12159982 rs2844887 rs2844888 NA06985 "CC" "CC" "CC" "CC" "CC" NA06991 "CC" "CC" "CC" "CC" "CC" NA06993 "CC" "CC" "CC" …
JoshDG
36
votes
6 answers

jq: selecting a subset of keys from an object

Given an input json string of keys from an array, return an object with only the entries that had keys in the original object and in the input array. I have a solution but I think that it isn't elegant ({($k):$input[$k]} feels especially clunky...)…
Jon
36
votes
1 answer

Understanding .I in data.table in R

I was playing around with data.table and I came across a distinction that I'm not sure I quite understand. Given the following dataset: library(data.table) set.seed(400) DT <- data.table(x = sample(LETTERS[1:5], 20, TRUE), key = "x"); DT Can you…
black_sheep07
35
votes
2 answers

Filtering a data frame on a vector

I have a data frame df with an ID column, e.g. A, B, etc. I also have a vector containing certain IDs: L <- c("A", "B", "E") How can I filter the data frame to get only the IDs present in the vector? Individually, I would use subset(df, ID == "A") but…
adam.888
35
votes
5 answers

Subsetting a 2D numpy array

I have looked into the documentation and also other questions here, but it seems I have not got the hang of subsetting in numpy arrays yet. I have a numpy array, and for the sake of argument, let it be defined as follows: import numpy as np a =…
Vahid S. Bokharaie
35
votes
2 answers

Using grep to help subset a data frame

I am having trouble subsetting my data. I want the data subsetted on column x, where the first 3 characters are G45. My data frame: x <- c("G448", "G459", "G479", "G406") y <- c(1:4) My.Data <- data.frame (x,y) I have tried: subset (My.Data,…
Stewart Wiseman
35
votes
6 answers

Creating subset of a Set in Java

I have a LinkedHashSet, i.e. an ordered set. I'm trying to find a function to just return a subset of the set, i.e. the first 20 elements of the set. I know I can do it by creating a new set and then populating using an iteration of the first set but…
Paul Taylor
34
votes
3 answers

Subsetting data.table set by date range in R

I have a large dataset in data.table that I'd like to subset by a date range. My data set looks like this: testset <- data.table(date=as.Date(c("2013-07-02","2013-08-03","2013-09-04", "2013-10-05","2013-11-06")),…
black_sheep07
33
votes
4 answers

R: How to filter/subset a sequence of dates

I have this data: (complete for December) date sessions 1 2014-12-01 1932 2 2014-12-02 1828 3 2014-12-03 2349 4 2014-12-04 8192 5 2014-12-05 3188 6 2014-12-06 3277 And a need to subset/filter this, for example from…
Omar Gonzales
33
votes
3 answers

Subset data.table by logical column

I have a data.table with a logical column. Why can't the name of the logical column be used directly for the i argument? See the example. dt <- data.table(x = c(T, T, F, T), y = 1:4) # Works dt[dt$x] dt[!dt$x] # Works dt[x == T] dt[x == F] #…
djhurio
32
votes
2 answers

Subset data.frame by date

I have a dataset called EPL2011_12. I would like to make a new dataset by subsetting the original by date. The dates are in the column named Date. The dates are in DD-MM-YY format. I have tried EPL2011_12FirstHalf <- subset(EPL2011_12, Date >…
user1899793
31
votes
2 answers

Subsetting data.table using variables with same name as column

I want to subset a data.table using a variable which has the same name as the column, which leads to some problems: dt <- data.table(a=sample(c('a', 'b', 'c'), 20, replace=TRUE), b=sample(c('a', 'b', 'c'), 20, replace=TRUE), …
jakob-r
30
votes
3 answers

Subset based on variable column name

I'm wondering how to use the subset function if I don't know the name of the column I want to test. The scenario is this: I have a Shiny app where the user can pick a variable on which to filter (subset) the data table. I receive the column name…
adv12