First off, Stack Overflow keeps suggesting that this has already been answered, but I've been searching for 2.5 hours and haven't found anything that fits.
I'm trying to view values from a data frame with 940 rows. I would like to see the calories for each user ID on the first and last dates of the trial.
Id ActivityDay Calories
1 1503960366 2016-04-12 1985
2 1624580081 2016-04-12 1432
3 1644430081 2016-04-12 3199
4 1844505072 2016-04-12 2030
5 1927972279 2016-04-12 2220
6 2022484408 2016-04-12 2390
7 2026352035 2016-04-12 1459
8 2320127002 2016-04-12 2124
9 2347167796 2016-04-12 2344
10 2873212765 2016-04-12 1982
11 3372868164 2016-04-12 1788
12 3977333714 2016-04-12 1450
13 4020332650 2016-04-12 3654
14 4057192912 2016-04-12 2286
15 4319703577 2016-04-12 2115
16 4388161847 2016-04-12 2955
17 4445114986 2016-04-12 2113
18 4558609924 2016-04-12 1909
19 4702921684 2016-04-12 2947
20 5553957443 2016-04-12 2026
21 5577150313 2016-04-12 3405
22 6117666160 2016-04-12 1496
23 6290855005 2016-04-12 2560
24 6775888955 2016-04-12 1841
25 6962181067 2016-04-12 1994
26 7007744171 2016-04-12 2937
27 7086361926 2016-04-12 2772
28 8053475328 2016-04-12 3186
29 8253242879 2016-04-12 2044
30 8378563200 2016-04-12 3635
31 8583815059 2016-04-12 2650
32 8792009665 2016-04-12 2044
33 8877689391 2016-04-12 3921
34 1503960366 2016-04-13 1797
35 1624580081 2016-04-13 1411
36 1644430081 2016-04-13 2902
37 1844505072 2016-04-13 1860
38 1927972279 2016-04-13 2151
39 2022484408 2016-04-13 2601
40 2026352035 2016-04-13 1521
41 2320127002 2016-04-13 2003
42 2347167796 2016-04-13 2038
43 2873212765 2016-04-13 2004
44 3372868164 2016-04-13 2093
45 3977333714 2016-04-13 1495
46 4020332650 2016-04-13 1981
47 4057192912 2016-04-13 2306
48 4319703577 2016-04-13 2135
49 4388161847 2016-04-13 3092
50 4445114986 2016-04-13 2095
51 4558609924 2016-04-13 1722
52 4702921684 2016-04-13 2898
This is the sample data, omitting the other nearly 900 rows. I want to keep only the dates 2016-04-12 and 2016-05-12, which are the first and last days of the period over which the data was collected. I'd like to see the user IDs and their calories for those two dates only.
I've tried about 50 different approaches. Here is where I am right now:
Daily_Calories %>%
  group_by(Id, Calories) %>%
  arrange(ActivityDay) %>%
  as.data.frame()
I have not saved everything I've tried, as I'm new and RStudio gets messy and unorganized quickly, and then I get a bit lost.
I've also tried:
Daily_Calories %>%
  group_by(Id, Calories) %>%
  group_by(min(ActivityDay), max(ActivityDay)) %>%
  arrange(ActivityDay) %>%
  as.data.frame()
and got this:
Id ActivityDay Calories min(ActivityDay) max(ActivityDay)
1 1503960366 2016-04-12 1985 2016-04-12 2016-05-12
2 1624580081 2016-04-12 1432 2016-04-12 2016-05-12
3 1644430081 2016-04-12 3199 2016-04-12 2016-05-12
4 1844505072 2016-04-12 2030 2016-04-12 2016-05-12
5 1927972279 2016-04-12 2220 2016-04-12 2016-05-12
6 2022484408 2016-04-12 2390 2016-04-12 2016-05-12
7 2026352035 2016-04-12 1459 2016-04-12 2016-05-12
8 2320127002 2016-04-12 2124 2016-04-12 2016-05-12
9 2347167796 2016-04-12 2344 2016-04-12 2016-05-12
10 2873212765 2016-04-12 1982 2016-04-12 2016-05-12
11 3372868164 2016-04-12 1788 2016-04-12 2016-05-12
12 3977333714 2016-04-12 1450 2016-04-12 2016-05-12
and then tried this:
Daily_Calories %>%
  group_by(Id, Calories) %>%
  arrange(ActivityDay) %>%
  summarise(min(ActivityDay), max(ActivityDay)) %>%
  as.data.frame()
and got this:
Id Calories min(ActivityDay) max(ActivityDay)
1 1503960366 0 2016-05-12 2016-05-12
2 1503960366 1728 2016-04-17 2016-04-17
3 1503960366 1740 2016-05-08 2016-05-08
4 1503960366 1745 2016-04-15 2016-04-15
5 1503960366 1775 2016-04-21 2016-04-21
6 1503960366 1776 2016-04-14 2016-04-14
7 1503960366 1783 2016-05-11 2016-05-11
8 1503960366 1786 2016-04-20 2016-04-20
9 1503960366 1788 2016-04-24 2016-04-24
I'm not looking for the minimum and maximum calories, just the rows for the "minimum" and "maximum" dates, meaning 2016-04-12 and 2016-05-12. In all three attempts, 700+ rows were omitted from the printed output, which tells me the results are wrong. There are 33 users and 2 dates, so the result should have 66 rows.
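To make the goal concrete, here is a minimal sketch of the kind of row filter being described, assuming dplyr is loaded and ActivityDay is stored as a Date. It runs on a small toy stand-in for Daily_Calories built from rows shown above; the 2016-05-12 value for the second user is made up for illustration, and the name result is hypothetical. It has not been checked against the full 940-row data.

```r
library(dplyr)

# Toy stand-in for Daily_Calories (assumption: ActivityDay is a Date column).
# All values come from the output shown above except the last Calories
# value (1500), which is a made-up placeholder.
Daily_Calories <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366, 1503960366,
         1624580081, 1624580081, 1624580081),
  ActivityDay = as.Date(c("2016-04-12", "2016-04-13", "2016-04-17", "2016-05-12",
                          "2016-04-12", "2016-04-13", "2016-05-12")),
  Calories = c(1985, 1797, 1728, 0, 1432, 1411, 1500)
)

# Keep only the rows whose date equals the earliest or the latest date
# in the whole data frame. No group_by() is used, so min() and max()
# are computed over every row.
result <- Daily_Calories %>%
  filter(ActivityDay == min(ActivityDay) | ActivityDay == max(ActivityDay)) %>%
  arrange(ActivityDay, Id)

print(result)
```

On the toy data this keeps two rows per user. On the real data it would return one row per user per date, but only for users who actually have a record on both dates, so the total could come out below 66.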
I hope this is explained well enough; I'm trying to get better at asking questions. I appreciate your time and help.
Almost forgot: I don't want to create a new data frame, just see the results. That's why my code starts with just the data frame name. Does that make a difference? I'd prefer to view the results in the console. Cheers!
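On the question of assigning versus just printing, a small sketch may help show the difference. It assumes dplyr is loaded; the names toy and kept are hypothetical, and the three rows reuse values from the output above.

```r
library(dplyr)

# Hypothetical three-row data frame for one user, using values shown above.
toy <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366),
  ActivityDay = as.Date(c("2016-04-12", "2016-04-13", "2016-05-12")),
  Calories = c(1985, 1797, 0)
)

# No assignment: the filtered result is printed to the console and then discarded.
toy %>% filter(ActivityDay == as.Date("2016-04-12"))

# The original data frame is untouched and still has all three rows.
nrow(toy)

# With assignment: the same result is stored in a new object instead of printed.
kept <- toy %>% filter(ActivityDay == as.Date("2016-04-12"))
nrow(kept)
```

Starting the pipeline with the data frame name and not assigning the result only affects where the output goes. The filtering itself behaves the same either way.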