I want to repeat a particular query on a large dataset and I am sure the answer to my question is quite basic, but after reading various sources on 'for' loops, repeat and replicate functions for about 2 hours, I still can't find any examples which appear to do what I need to do.
The dataset contains survey data from particular sites which are split into plots and each plot contains multiple species entries so the data looks like this:
SITE PLOT SPECIES
1 1 a
1 1 b
1 2 a
1 2 c
1 3 b
1 3 c
1 3 d
1 4 a
1 5 a
1 5 b
2 1 b
2 1 c
2 3 a
2 3 b
2 4 b
2 4 c
2 4 d
2 5 e
The actual data is over 6500 rows as there are hundreds of sites and each should contain 20 plots - the issue is some plots are missing from some sites, so what I need to do is establish how many plots are missing in total. I can use the following code to query how many unique plots are on each site so in the example below I query how many unique plots are in site number 7:
NROW(unique(df$PLOT[df$SITE=="7"]))
[20]
But I have hundreds of sites, so is there a function that will allow me to query each site automatically without manually changing the site number each time?