I have a data.table dt:

library(data.table)
dt = data.table(a=LETTERS[c(1,1:3)],b=4:7)

   a b
1: A 4
2: A 5
3: B 6
4: C 7

The result of dt[, .N, by=a] is

   a N
1: A 2
2: B 1
3: C 1

I know that by=a or by="a" groups the rows by column a, and that the N column counts how many times each value of a occurs. However, I get this result without ever calling nrow(). Is .N more than just a column name? I can't find any documentation for it with ??".N" in R. I tried .K, but that doesn't work. What does .N mean?

user438383
Eric Chang
  • An explanation of `.N` is in the `?data.table` documentation under Arguments->by – digEmAll Oct 13 '15 at 12:35
  • More info in this cheat sheet https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf – Pierre L Oct 13 '15 at 12:42
  • I see now why I couldn't find the documentation for ".N" in RStudio: it is in the PDF reference manual but not in the HTML documentation. Thanks, digEmAll and Pierre Lafortune. The cheat sheet is interesting and will help me improve my coding skills. – Eric Chang Oct 13 '15 at 12:48
  • Though this is a beginner's question, would you like to write an answer, @digEmAll? It might help newcomers like me who are learning to manipulate data.table. – Eric Chang Oct 13 '15 at 12:57
  • Please read the *Introduction to data.table* vignette either from the [github project page](https://github.com/Rdatatable/data.table/wiki/Getting-started) or from [CRAN's data.table page](https://cran.r-project.org/web/packages/data.table/index.html). – Arun Oct 13 '15 at 13:02

1 Answer

Think of .N as a built-in variable that holds a number of rows. When it is used in i, it is the total number of rows in the table, so dt[.N] picks out the last row:

dt <- data.table(a = LETTERS[c(1,1:3)], b = 4:7)

dt[.N] # returns the last row
#    a b
# 1: C 7
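
To connect this with nrow() from the question, here is a minimal sketch using the same dt. It assumes only that data.table is installed; the variable names total, last_row and second_last are made up for illustration.

```r
library(data.table)

dt <- data.table(a = LETTERS[c(1, 1:3)], b = 4:7)

# With no grouping, .N in j is the total row count, the same value as nrow(dt)
total <- dt[, .N]

# In i, .N is also the total row count, so it can be used to index from the end
last_row    <- dt[.N]       # the last row
second_last <- dt[.N - 1L]  # the row before it

total == nrow(dt)  # TRUE
```

Because .N is an ordinary integer inside the brackets, it can take part in arithmetic such as .N - 1L.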

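When .N is used in j together with by, it is the number of rows in each group. Your example returns one count per group; assigning .N with := instead adds that count as a new column: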

dt[, new_var := .N, by = a]
dt
#    a b new_var
# 1: A 4       2 # 2 'A's
# 2: A 5       2
# 3: B 6       1 # 1 'B'
# 4: C 7       1 # 1 'C'
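
As a rough check that the N column in the question is a per-group row count and not a special column name, the sketch below computes the same numbers three ways. The names counts, named, by_hand and n_rows are chosen here for illustration only.

```r
library(data.table)

dt <- data.table(a = LETTERS[c(1, 1:3)], b = 4:7)

# One row per group; an unnamed .N in j gets the default column name "N"
counts <- dt[, .N, by = a]

# The same counts under a name of our own choosing
named <- dt[, .(n_rows = .N), by = a]

# The same numbers computed by hand: nrow() on each group's subset
by_hand <- sapply(unique(dt$a), function(g) nrow(dt[a == g]))

all(by_hand == counts$N)  # TRUE
```

So the N in the output is just the default name data.table gives to an unnamed .N, which is why something like .K has no meaning.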

For a list of all special symbols of data.table, see also https://www.rdocumentation.org/packages/data.table/versions/1.10.0/topics/special-symbols
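
As a small illustration of how .N sits alongside two of the other special symbols listed there, the sketch below uses .GRP (the group counter) and .I (the row numbers of the group's rows). The output column names group_id, first_row and n are hypothetical.

```r
library(data.table)

dt <- data.table(a = LETTERS[c(1, 1:3)], b = 4:7)

# Per group: its counter, the row number of its first row, and its size
info <- dt[, .(group_id = .GRP, first_row = .I[1L], n = .N), by = a]
info
#    a group_id first_row n
# 1: A        1         1 2
# 2: B        2         3 1
# 3: C        3         4 1
```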

David