How can I modify this code to have an aggregated view in the timeline?

I don't want three separate lines just for avocado; I want a single line covering the period when avocado is present. Ideally, I would even like one line for the whole data set, rather than one complete line for avocado, one for strawberry, and one for blueberry. Any ideas are appreciated.

Also, how can I display the times correctly? Consecutive timestamps differ by less than a second, but the chart shows differences measured in years.

library("googleVis")
dd <- read.csv(header = TRUE, text = "rosbagTimestamp,data
1438293919014802388,avocado
1438293919078955343,avocado
1438293919082352685,avocado
1438293919146142553,0
1438293919177955753,0
1438293919244013175,strawberry
1438293919251252990,strawberry
1438293919322521358,blueberry
1438293919327731275,blueberry")

dd <- within(dd, {
  end <- as.POSIXct(as.numeric(substr(rosbagTimestamp, 1, 10)),
                    origin = '1970-01-01')
  start <- as.POSIXct(as.numeric(substr(rosbagTimestamp, 11, 19)),
                      origin = '1970-01-01')
  rosbagTimestamp <- NULL
})

#         data               start                 end
# 1    avocado 1970-06-21 03:47:12 2015-07-30 18:05:19
# 2    avocado 1972-07-02 16:01:04 2015-07-30 18:05:19
# 3    avocado 1972-08-10 23:44:00 2015-07-30 18:05:19
# 4          0 1974-08-19 07:07:44 2015-07-30 18:05:19
# 5          0 1975-08-22 12:10:40 2015-07-30 18:05:19
# 6 strawberry 1977-09-25 01:24:16 2015-07-30 18:05:19
# 7 strawberry 1977-12-17 19:29:52 2015-07-30 18:05:19
# 8  blueberry 1980-03-21 16:15:44 2015-07-30 18:05:19
# 9  blueberry 1980-05-21 00:26:40 2015-07-30 18:05:19

plot(gvisTimeline(dd, rowlabel = 'data', barlabel = 'data',
                  start = 'start', end = 'end'))

rawr
Mona Jalal

1 Answer

You just need to divide by an appropriate power of ten and then use your favorite aggregation tool to sum the times by group (if I understand you correctly):

library('googleVis')
dd <- read.csv(header = TRUE, text = "rosbagTimestamp,data
               1438293919014802388,avocado
               1438293919078955343,avocado
               1438293919082352685,avocado
               1438293919146142553,0
               1438293919177955753,0
               1438293919244013175,strawberry
               1438293919251252990,strawberry
               1438293919322521358,blueberry
               1438293919327731275,blueberry")

dd <- within(dd, {
  end <- as.POSIXct(as.numeric(substr(rosbagTimestamp, 1, 10)) / 1e8,
                    origin = '1970-01-01')
  start <- as.POSIXct(as.numeric(substr(rosbagTimestamp, 11, 19)) / 1e8,
                      origin = '1970-01-01')
  rosbagTimestamp <- NULL
})

## sum the times by group
dd1 <- aggregate(. ~ data, data = dd, sum)
dd1 <- within(dd1, {
  start <- as.POSIXct(start, origin = '1970-01-01')
  end <- as.POSIXct(end, origin = '1970-01-01')
})

#         data               start                 end
# 1          0 1969-12-31 19:00:03 1969-12-31 19:00:28
# 2    avocado 1969-12-31 19:00:01 1969-12-31 19:00:43
# 3  blueberry 1969-12-31 19:00:06 1969-12-31 19:00:28
# 4 strawberry 1969-12-31 19:00:04 1969-12-31 19:00:28

plot(gvisTimeline(dd1, rowlabel = 'data', barlabel = 'data',
                  start = 'start', end = 'end'))

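A minimal sketch of a different reading of the data, not the method used above. It assumes each `rosbagTimestamp` is a single instant in nanoseconds since the Unix epoch (digits 1-10 are whole seconds, digits 11-19 the fractional part) rather than an encoded start/end pair; this matches the first ten digits decoding to 2015-07-30 in the question's output. It also assumes that `0` means "no fruit". The names `time`, `spans`, `fruit`, and `row` are illustrative only. Consecutive rows with the same label are collapsed into one interval, and all intervals share a single row label so they land on one timeline row.

```r
# ASSUMPTION: each rosbagTimestamp is one instant, in nanoseconds since the Unix epoch.
# Read as text so the 19 digits are not rounded by conversion to double.
dd <- read.csv(header = TRUE, colClasses = "character", text = "rosbagTimestamp,data
1438293919014802388,avocado
1438293919078955343,avocado
1438293919082352685,avocado
1438293919146142553,0
1438293919177955753,0
1438293919244013175,strawberry
1438293919251252990,strawberry
1438293919322521358,blueberry
1438293919327731275,blueberry")

# Digits 1-10 are whole seconds, digits 11-19 the nanosecond remainder.
secs  <- as.numeric(substr(dd$rosbagTimestamp, 1, 10))
nanos <- as.numeric(substr(dd$rosbagTimestamp, 11, 19))
dd$time <- as.POSIXct(secs + nanos / 1e9, origin = "1970-01-01", tz = "UTC")

# Collapse each run of consecutive identical labels into one start/end interval.
runs  <- rle(dd$data)
last  <- cumsum(runs$lengths)
first <- last - runs$lengths + 1
spans <- data.frame(data  = runs$values,
                    start = dd$time[first],
                    end   = dd$time[last],
                    stringsAsFactors = FALSE)

# ASSUMPTION: "0" marks "no fruit", so those intervals are dropped.
fruit <- spans[spans$data != "0", ]
fruit$row <- "fruit"  # one shared row label puts every bar on the same line

if (requireNamespace("googleVis", quietly = TRUE)) {
  tl <- googleVis::gvisTimeline(fruit, rowlabel = "row", barlabel = "data",
                                start = "start", end = "end")
  # plot(tl)  # uncomment to open the chart in a browser
}
```

If a label reappears later in the data, each run becomes its own bar, so intervals cannot overlap under this reading. Whether the rendered chart keeps sub-second resolution is worth checking on real data.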

rawr
  • So I am using your code, except that I am reading from a CSV, and I get this error: `Invalid data at row #7: start(Sat Jul 29 67882 20:56:51 GMT-0500 (CDT)) > end(Wed Dec 31 1969 18:33:04 GMT-0600 (CST)).×` I read the CSV with this line: `dd <- read.csv("_slash_gaze.csv", header = TRUE, sep = ",")` and this is the CSV: http://pastebin.com/st8FwrQg Why does the code break? – Mona Jalal Aug 07 '15 at 01:45
  • Also, where do people usually post their .csv files online, if not on pastebin, when asking questions on Stack Overflow? Thanks! @rawr – Mona Jalal Aug 07 '15 at 02:38
  • actually it is kind of weird. now that I am running the same code again I get the error at row #5! `Invalid data at row #5: start(Sat Jul 27 256430 14:35:45 GMT-0500 (CDT)) > end(Wed Dec 31 1969 18:06:44 GMT-0600 (CST)).× ` – Mona Jalal Aug 07 '15 at 02:41
  • @MonaJalal Yeah, when you read it in, add `read.csv(..., colClasses = c('character','character'))`. The times are being read in as numbers, which they are, but since they are so big, they are printed/read by `substr` in scientific notation, which causes problems. Pastebin is fine if you need to post a large amount that won't fit in your question, provided all of it is relevant. You could also upload to GitHub or use gists like [this](https://gist.github.com/MrFlick/c1183c911bc5398105d4) – rawr Aug 07 '15 at 02:41
  • So there is supposed to be no overlap between the timelines for avocado, berry, or any other two fruits, but as you can see above there is overlap. Can you guess what the source of the problem might be? Thank you! – Mona Jalal Aug 10 '15 at 22:34
  • @MonaJalal No, I don't really have an idea. It would be best to figure out how the time stamps were coded. I played around with it until a correct-looking format came out based on your question (you expected it to be in seconds; that is where dividing by 1e8 came from). And really it was the only one I found that worked. So I'm not sure of the proper way to parse the times, and that might be the source of the error. – rawr Aug 10 '15 at 23:05
  • I have tried 1e9, 1e8, 1e7, and 1e10, and none is correct. For example, the whole experiment lasts 4 minutes, but the chart shows about 40 minutes! – Mona Jalal Aug 10 '15 at 23:30
  • @MonaJalal the 1e8 is more about the scaling, the more you increase the factor the shorter this time. I was thinking more along the lines of how to separate out the long integers into start and end times (again, we were just assuming this was the format of the times, it could be something totally different, but as a happy coincidence, something worked, namely splitting digits 1-10 and 11-19). It's odd that the first part is 10 digits and the second is only 9 which makes me think that might still be the problem – rawr Aug 11 '15 at 00:24
  • So, considering that they all happen on the same date, can we just consider the minutes, seconds, and milliseconds at which each event happens? Do you know how to do so? I guess that might somewhat solve the problem – Mona Jalal Aug 17 '15 at 15:02
  • Can you please take a look at here: http://stackoverflow.com/questions/32053999/how-to-split-the-timestamp-in-r-for-googlevis-for-no-overlap the splitting didn't work. – Mona Jalal Aug 17 '15 at 15:18
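
The `colClasses` suggestion in the comments above can be sketched as follows. This is only an illustration: the temporary file is a hypothetical stand-in for the asker's CSV, and the names `path`, `num`, and `chr` are made up for the example. It shows that reading the timestamp column as text keeps all 19 digits available to `substr`, whereas the default read converts the column to a number first.

```r
# Hypothetical stand-in for the asker's CSV file.
path <- tempfile(fileext = ".csv")
writeLines(c("rosbagTimestamp,data",
             "1438293919014802388,avocado",
             "1438293919244013175,strawberry"), path)

# Default read: the 19-digit value is converted to a number, so substr()
# later works on its printed form rather than on the original digits.
num <- read.csv(path, header = TRUE)
class(num$rosbagTimestamp)

# With colClasses, both columns stay as text and the digits are kept verbatim.
chr <- read.csv(path, header = TRUE,
                colClasses = c("character", "character"))
substr(chr$rosbagTimestamp, 1, 10)   # first ten digits
substr(chr$rosbagTimestamp, 11, 19)  # remaining nine digits
```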