I have the following R data.table:
library(data.table)
dt =
unique_point biased data_points team groupID
1: up1 FALSE 3 1 xy28352
2: up1 TRUE 4 22 xy28352
3: up2 FALSE 1 4 xy28352
4: up2 TRUE 0 3 xy28352
5: up3 FALSE 12 5 xy28352
6: up3 TRUE 35 7 xy28352
....
I've formatted the data.table such that for each unique_point, I am measuring the data points for unbiased and biased. So each unique_point has two rows, biased FALSE and biased TRUE. If there are no measurements, this is recorded as 0.
As an example, for up1, there are 3 data points for the unbiased experiment, and 4 data points for the biased experiment.
Each groupID has 25 teams, each potentially with a measurement for biased and unbiased. I would like to re-format the data.table so that it also records the number of data points by team for each unique_point (given the data, this will produce many rows with data_points of 0).
unique_point biased data_points team groupID
1: up1 FALSE 3 1 xy28352
2: up1 TRUE 0 1 xy28352
3: up1 FALSE 0 2 xy28352
4: up1 TRUE 0 2 xy28352
5: up1 FALSE 0 3 xy28352
6: up1 TRUE 0 3 xy28352
....
44: up1 TRUE 4 22 xy28352
....
49: up1 FALSE 0 25 xy28352
50: up1 TRUE 0 25 xy28352
This task is very close to somehow "unfolding" the data.table. For each unique_point, I would create 50 rows: 25 teams, each with biased TRUE and FALSE. The added complication is that I need to carry the existing data_points counts over into the matching rows and fill every other row with 0.
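To make the "unfolding" concrete, here is a minimal sketch of what I have in mind, not necessarily the best approach. It assumes teams are numbered 1 to 25, that there is a single groupID as in the sample, and that the toy `dt` below is a faithful reconstruction of the rows shown above. The names `grid` and `expanded` are just placeholders I made up.

```r
library(data.table)

# Toy reconstruction of the sample rows shown above (one groupID only).
dt <- data.table(
  unique_point = rep(c("up1", "up2", "up3"), each = 2),
  biased       = rep(c(FALSE, TRUE), times = 3),
  data_points  = c(3L, 4L, 1L, 0L, 12L, 35L),
  team         = c(1L, 22L, 4L, 3L, 5L, 7L),
  groupID      = "xy28352"
)

# Every combination of unique_point x team (assumed to be 1..25) x biased.
grid <- CJ(
  unique_point = unique(dt$unique_point),
  team         = 1:25,
  biased       = c(FALSE, TRUE)
)

# Join dt onto the grid: every grid row is kept, and data_points/groupID
# are pulled in wherever a matching row exists in dt.
expanded <- dt[grid, on = .(unique_point, team, biased)]

# Combinations that were never measured come back as NA; record them as 0.
expanded[is.na(data_points), data_points := 0L]

# Assumption: a single groupID, so it can simply be filled in everywhere.
expanded[, groupID := dt$groupID[1]]

# Restore the original column order.
setcolorder(expanded, names(dt))
```

This gives 50 rows per unique_point, ordered by unique_point, team and biased. With several groupIDs, the same steps would presumably have to be run per group.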
There should be a way to use unique() to count the times the rows exist possibly?
If I try
setkey(dt, unique_point, team)[CJ(unique(unique_point), unique(team)), .N, by=.EACHI]
I am counting the number of rows which occur for unique_point and team. But this wouldn't keep the data_points.
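A variation on that attempt that might keep the counts is to add biased to the cross join and sum data_points per joined row instead of using .N. This is only a sketch under the same assumptions as the sample (teams numbered 1 to 25, one groupID). The name `res` is made up, and groupID is dropped from the result.

```r
library(data.table)

# Toy reconstruction of the sample rows (one groupID only).
dt <- data.table(
  unique_point = rep(c("up1", "up2", "up3"), each = 2),
  biased       = rep(c(FALSE, TRUE), times = 3),
  data_points  = c(3L, 4L, 1L, 0L, 12L, 35L),
  team         = c(1L, 22L, 4L, 3L, 5L, 7L),
  groupID      = "xy28352"
)

# Cross join over unique_point x team (assumed 1..25) x biased, then
# aggregate once per row of the cross join (by = .EACHI).
# Combinations with no row in dt sum to 0 because of na.rm = TRUE.
res <- dt[
  CJ(unique_point = unique(dt$unique_point),
     team         = 1:25,
     biased       = c(FALSE, TRUE)),
  on = .(unique_point, team, biased),
  .(data_points = sum(data_points, na.rm = TRUE)),
  by = .EACHI
]
```

Using an ad hoc `on =` join here avoids having to keep the setkey() column order in sync with the order of the columns in CJ().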