Can anyone explain what the line t[exists,][1:6,] is doing in the code below, and how that subsetting works?
t<-trees
t[1,1]= NA
t[5,3]= NA
t[1:6,]
exists<-complete.cases(t)
exists
t[exists,][1:6,]
The complete.cases function checks the data frame and returns a logical vector of TRUE and FALSE values, where TRUE indicates a row with no missing data. The vector has one element for each row of t.
The t[exists,] part subsets the data so that only rows where exists is TRUE are kept; rows that have missing data are FALSE in exists and are removed. The [1:6,] part then takes only the first 6 of the rows that have no missing data.
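As a rough sketch of the two steps just described, the subsetting can be split into separate statements. The names dat, keep, complete_rows and first_six are made up for this illustration; the question itself uses t and exists.

```r
# Illustrative names only: dat, keep, complete_rows, first_six
dat <- trees               # built-in data set with 31 rows
dat[1, 1] <- NA            # make row 1 incomplete
dat[5, 3] <- NA            # make row 5 incomplete

keep <- complete.cases(dat)        # logical vector, one element per row
complete_rows <- dat[keep, ]       # drops rows 1 and 5
first_six <- complete_rows[1:6, ]  # first six of the remaining rows

rownames(first_six)  # original row labels of the rows that survive
```

Because rows 1 and 5 are dropped first, the six rows returned are original rows 2, 3, 4, 6, 7 and 8, not rows 1 to 6 of the original data.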
In R, [ is a function like any other. R parses t[exists, ] as
`[`(t, exists, ) # don't forget the backticks, or the blank last argument!
Indeed, you can always call [ with the backtick-and-parentheses syntax, or, even crazier, use it in constructions like
as.data.frame(lapply(t[exists, ], `[`, 1:6))
which, believe it or not, is (almost) equivalent to t[exists,][1:6,]; the values are the same, but the row names are renumbered.
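A quick way to convince yourself of this is to compare the results directly. This is only a sketch; dat and keep are stand-in names for the t and exists of the question.

```r
dat <- trees
dat[1, 1] <- NA
dat[5, 3] <- NA
keep <- complete.cases(dat)

# Bracket syntax and the explicit function call give the same object
bracket_form  <- dat[keep, ]
function_form <- `[`(dat, keep, )
identical(bracket_form, function_form)

# Column-by-column version: each column is a plain vector,
# so it is indexed with 1:6 alone
by_column <- as.data.frame(lapply(dat[keep, ], `[`, 1:6))
chained   <- dat[keep, ][1:6, ]

# Same values, but by_column gets fresh row names
all.equal(by_column, chained, check.attributes = FALSE)
rownames(by_column)
rownames(chained)
```

The difference in row names is where the "almost" comes from: the chained form keeps the original row labels, while rebuilding the data frame from its columns starts the labels over.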
The same is true for functions like [[, $, and more exotic ones like names<-, which is a special function that assigns its value argument to the names attribute of an object. We use functions like this all the time with syntax like
names(iris) <- tolower(names(iris))
without realizing that what we're really doing is
iris <- `names<-`(iris, tolower(names(iris)))
And finally, you can type
?`[`
for documentation, or type
`[`
to return the definition, just like any other function.
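To see the replacement-function form in action, the following sketch compares the usual assignment syntax with a direct call to names<-. The variables flowers and renamed are introduced here just for the comparison.

```r
flowers <- iris
names(flowers) <- tolower(names(flowers))   # usual replacement syntax

# Direct call: returns the renamed object, which still has to be assigned
renamed <- `names<-`(iris, tolower(names(iris)))

identical(flowers, renamed)
names(renamed)
```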
What t[exists,][1:6,] does
The simple answer is that R parses t[exists,][1:6,] as something like:
1. Take t.
2. Keep only the rows that correspond to TRUE elements of exists.
3. From that result, take 1:6, i.e. rows 1 through 6.
The more complicated answer is that this is handled by the parser as:
`[`(`[`(t, exists, ), 1:6, ) # yes, this has blank arguments
which a human can interpret as
temporary_variable_1 <- `[`(t, exists, )
temporary_variable_2 <- `[`(temporary_variable_1, 1:6, )
print(temporary_variable_2) # implicitly, sending an object by itself to the console will `print` that object
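The claim that the chained brackets, the nested call, and the two-step version all agree can be checked with a sketch like this one (dat, keep, step_1 and step_2 are illustrative names, not from the question):

```r
dat <- trees
dat[1, 1] <- NA
dat[5, 3] <- NA
keep <- complete.cases(dat)

chained <- dat[keep, ][1:6, ]
nested  <- `[`(`[`(dat, keep, ), 1:6, )   # blank arguments mean "all columns"

step_1 <- `[`(dat, keep, )
step_2 <- `[`(step_1, 1:6, )

identical(chained, nested)
identical(chained, step_2)
```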
Interestingly, because you typically can't pass blank arguments in R, certain constructions are impossible with the bracket function, like eval(call("[", t, exists, )), which will throw an "undefined columns selected" error.