I have the following dataset
> dataset2
ID ATCcode date
1 1 N06AA 2001-01-01
2 1 N06AB 2001-04-01
3 1 N06AB 2001-03-01
4 1 N06AB 2001-02-01
5 1 N06AC 2001-01-01
6 2 N06AA 2001-01-01
7 2 N06AA 2001-02-01
8 2 N06AA 2001-03-01
9 3 N06AB 2001-01-01
10 4 N06AA 2001-02-01
11 4 N06AB 2001-03-01
It's in long format and I'd like it to be in wide format. However, I only want the earliest date for each ATCcode - and not any of the later dates. Thus I'd like to end up here:
> datasetLong
ID N06AA N06AB N06AC
1 1 2001-01-01 2001-02-01 2001-01-01
2 2 2001-01-01 <NA> <NA>
3 3 <NA> 2001-01-01 <NA>
4 4 2001-02-01 2001-03-01 <NA>
(This is just a sample of the real dataset, it has more variation in the values and more observations than this).
I've managed to cast the dataset somewhat, but not in the manner which I want to:
dataset3 <- reshape2::dcast(dataset2, ID ~ ATCcode)
gives me the length of each vector/list:
> dataset3
ID N06AA N06AB N06AC
1 1 1 3 1
2 2 3 0 0
3 3 0 1 0
4 4 1 1 0
Instead of the length, I'd like just one value, and that value should be the smallest value (or, the earliest date).
I've found a similar question asked on StackOverflow, but I was unable to use that in any way without getting various errors. I have not used melt in the above attempt, is that maybe necessary? Any help is appreciated.