Questions tagged [subset]

A subset consists of those elements selected from a larger set of elements, by their position in the larger set or other features, such as their value.

Definition:

From Wikipedia:

a set A is a subset of a set B, or equivalently B is a superset of A, if A is 'contained' inside B, that is, all elements of A are also elements of B.
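As a minimal sketch of that definition, Python's built-in set type can test the relation directly. The sets `A` and `B` below are illustrative values, not taken from the source.

```python
# Illustrative sets; the names A and B follow the definition above.
A = {1, 2}
B = {1, 2, 3}

# A is a subset of B when every element of A is also an element of B.
a_in_b = A.issubset(B)          # equivalent to A <= B
b_contains_a = B.issuperset(A)  # equivalent to B >= A

# A proper subset additionally requires that the two sets differ.
a_proper = A < B

# The relation is not symmetric: B has an element that A lacks.
b_in_a = B <= A
```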

Uses:

  • In R, subset is a function that selects a subset of elements from a vector, matrix, or data frame, given some logical expression (caution: subset drops rows for which the condition evaluates to NA; see How to subset data in R without losing NA rows?). However, for programmatic use (as opposed to interactive use) it is better to use the [ (or [[) operators or the filter function from dplyr. substring is used to extract parts of character strings.
  • In , a subset of an array can be obtained with array[indices].
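The two ways of selecting a subset mentioned above, by position and by value, might look like the following in Python. This is only a sketch: the list `values` and the threshold are made up for illustration, and plain lists are used because the source does not say which language the last bullet refers to.

```python
values = [10, 25, 3, 42, 7]

# Subset by position: a slice, and an explicit collection of indices.
first_three = values[:3]
picked = [values[i] for i in (0, 3)]

# Subset by value: keep only the elements that satisfy a condition.
large = [v for v in values if v > 9]
```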
6799 questions
2
votes
4 answers

Filtering a list of integer ranges to exclude subsets in Python

I'm trying to find a faster way to filter my list of ranges, so that any range that can be covered completely by a larger range will be excluded. For example, #all ranges have width >1, which means no such case like xx=[1,1] in my list #each range…
Helene
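One possible approach to the range-filtering question above is to sort the ranges and sweep once, which avoids comparing every pair. This is a sketch under assumptions: ranges are `[start, end]` pairs with inclusive endpoints, "covered" means covered by a single other range, and the function name `drop_covered` is hypothetical. The question excerpt is truncated, so the asker's exact requirements may differ.

```python
def drop_covered(ranges):
    """Return the ranges that are not fully covered by another single range."""
    # Sort by start ascending, then by end descending, so any range that
    # could cover the current one has already been visited.
    ordered = sorted(ranges, key=lambda r: (r[0], -r[1]))
    kept = []
    max_end = float("-inf")
    for start, end in ordered:
        # An earlier range starts no later than this one; if it also ends
        # no earlier, this range is covered and can be skipped.
        if end > max_end:
            kept.append([start, end])
            max_end = end
    return kept


sample = [[1, 5], [2, 4], [3, 8], [6, 7], [1, 3]]
filtered = drop_covered(sample)
```

The sort dominates the cost, so the sweep runs in O(n log n) time. Exact duplicates collapse to a single kept range under this sketch.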
2
votes
2 answers

How can I subset by date, using a wild card?

I have a data frame : $Date, $name, $value 1949-05-01, Hurricane, 5 1950-02-01, Hurricane, 6 1950-03-01, 1950-04-01, 1950-05-01, 1951-02-01, 1951-03-01, 1951-04-01, These dates go all the way to 2015, with measurements for months 02, 03,04 and 05.…
Christopher
2
votes
2 answers

Coq Program matching on pair

I was trying to write a safe get function for lists using subset types. I tried this definition using Program: Program Fixpoint get A (l : list A) (n : {x : nat | x < length l} ) : A := match (n, l) with | (O, x :: l') => x | (S n', x :: l') =>…
Nico Lehmann
2
votes
2 answers

How do I extract a subset of a time series according to a custom interval using pandas?

I have a dataset of forex prices for every minute, 24 hours a day, every day, for one month. However, the forex market is only actually open from 17:00 on Sunday to 16:00 on Friday, the data in between these times is simply padded with the last…
Tom
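The weekly trading window described in the question above (open from 17:00 on Sunday to 16:00 on Friday) can be expressed as a simple predicate. The sketch below uses the standard library's `datetime` rather than pandas, and assumes timestamps are naive and already in the market's time zone; `market_open` and `open_rows` are hypothetical names.

```python
from datetime import datetime, time


def market_open(ts):
    """Return True if ts falls between Sunday 17:00 and Friday 16:00."""
    day = ts.weekday()  # Monday is 0, Sunday is 6
    if day == 6:        # Sunday: open from 17:00 onward
        return ts.time() >= time(17, 0)
    if day == 4:        # Friday: open until 16:00
        return ts.time() < time(16, 0)
    return day != 5     # Monday to Thursday open, Saturday closed


def open_rows(rows):
    """Keep only the (timestamp, price) pairs recorded while the market is open."""
    return [(ts, price) for ts, price in rows if market_open(ts)]
```

With a pandas time series, the same predicate could in principle be applied to the index to build a boolean mask, though that variant is not shown here.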
2
votes
2 answers

subset based on date in a reference table

I have table1 as follows. StudentId Date1 Lunch 23433 2014-08-26 Yes 233989 2014-08-18 No 909978 2014-08-06 No 777492 2014-08-11 Yes 3987387 2014-08-26 …
2
votes
1 answer

dplyr group_by abs() filtering of data

Say I have data as follows A <- c(1,1,1,2,2,2,3,3,3) B <- c(1,0,0,1,0,0,1,0,0) C <- c(8,7,6,8,7,8,9,9,11) D <- data.frame(A,B,C) D library(dplyr) E <- D %>% group_by(B) %>% filter(abs(diff(C)) <= 1) to remove these cases, so that those shown…
lukeg
2
votes
2 answers

Multiple filter using grep and subset in R

I'm trying to create a filter to remove lines from a dataset using grep and subset together. Sample dataset: id <- 1:10 problem <- c("a" , "b", "c", "d", "a","b","c","a", "b", "a") solution1 <- c("eat", "sleep", "drink", "play", "sleep", "play",…
Andrew Fang
2
votes
1 answer

Subset duplicates based on two columns

My data looks like this: A B 1 2 1A 2 1A 2 2 3 2 4 2 4 3A 0 3A 0 4A 1 4A 1 5 5 I want to subset the data, and extract all records that are duplicates, based on values on both columns. I tried using cbind, and unique, but they…
Litwos
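The question above is about R, but the underlying idea (keep every record whose pair of column values occurs more than once) can be sketched in Python with a counter. The rows below are a small illustrative sample in the spirit of the excerpt, and `duplicated_rows` is a hypothetical helper name.

```python
from collections import Counter


def duplicated_rows(rows):
    """Return every row whose (A, B) pair appears more than once, in order."""
    counts = Counter(rows)
    return [row for row in rows if counts[row] > 1]


# Each row is an (A, B) tuple.
rows = [("1", 2), ("1A", 2), ("1A", 2), ("2", 3), ("2", 4), ("2", 4), ("5", 5)]
dupes = duplicated_rows(rows)
```

All copies of a duplicated record are returned, not just the second and later ones.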
2
votes
1 answer

Subset a data frame in R based on above and below a threshold value

I searched a lot for a post similar to mine below, but with no luck yet. I have 1 column of data like below (extracted from an original big file having many columns) C1 0 1 2 3 4 3 3 2 1 From this data I want to generate a new column C2 where in…
2
votes
4 answers

Dynamic Programming approach for a subset sum

Given the following input: 10 4 3 5 5 7, where 10 = total score, 4 = 4 players, 3 = score by player 1, 5 = score by player 2, 5 = score by player 3, 7 = score by player 4. I am to print the players whose combined score adds up to the total, so the output can be 1 4…
user1010101
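A dynamic-programming approach to the subset-sum question above might track, for each reachable sum, one set of players that produces it. This is a sketch, not the accepted answer: it assumes non-negative integer scores, returns only one valid combination, and `find_players` is a hypothetical name.

```python
def find_players(total, scores):
    """Return 1-based player numbers whose scores sum to total, or None."""
    # reachable maps each attainable sum to one list of players producing it.
    reachable = {0: []}
    for player, score in enumerate(scores, start=1):
        # Iterate over a snapshot so each player is used at most once.
        for current, chosen in list(reachable.items()):
            new_sum = current + score
            if new_sum <= total and new_sum not in reachable:
                reachable[new_sum] = chosen + [player]
    return reachable.get(total)


players = find_players(10, [3, 5, 5, 7])
```

For the input in the question, the sketch finds players 2 and 3 (5 + 5); players 1 and 4 (3 + 7) would be an equally valid answer. The table holds at most `total + 1` sums, so the work is bounded by the number of players times the total.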
2
votes
1 answer

How to assign to a subset of an R object with a name given as string

I have the name of a matrix as string and would like to assign to a column of that matrix. A <- matrix(1:4,2) v <- 10:11 name <- "A" get(name)[,2] <- v This does not work because the LHS is just a value (i.e. a vector) and has lost the meaning of…
user1965813
2
votes
4 answers

Loop or apply for sum of rows based on multiple conditions in R dataframe

I've hacked together a quick solution to my problem, but I have a feeling it's quite obtuse. Moreover, it uses for loops, which from what I've gathered, should be avoided at all costs in R. Any and all advice to tidy up this code is appreciated. I'm…
Zenit
2
votes
1 answer

DataFrames Find rows (and index) satisfying a condition via expression

Given a DataFrame and an expression, I would like to be able to subset the DataFrame using this expression. Also, I would like to receive the index vector telling me which rows satisfy the conditions. I provide an example: df = DataFrame(x1 = 1:3, x2…
user2546346
2
votes
2 answers

Calculating the slope of each row in a large data set using R

I have a large data set of the following format: First column is type, and the subsequent columns are different times that 'type' happens. I want to calculate the slope of each row (~7000 rows) for subset T0-T2 and then t0-t2 and output that…
Anita
2
votes
1 answer

Subset multiple data frames in a list that match a certain condition

I am new to this and I am stuck. I have a list of data frames that have information about pressure, temperature and salinity. I want to subset all of them and keep only the values of temperature and salinity when the pressure is equal to 5. Below…