I have a list of POSIX timestamps (a tweet dataset). I want to select a specific week-long period (Friday noon to Friday noon) and count how many tweets were published between the current system time's position in the week (e.g., Wednesday, 16:00) and the end of the period.

This code obviously doesn't work, because the current time is always later than every timestamp in the historical data.

time.now=as.POSIXct(Sys.time())
sum(data$week==15 & data$time > time.now)

Is there a way to transform my data into a date-agnostic format that starts and ends on Friday noon and keeps only the weekday and the time of day?

Thanks!

    I'm sorry, but it is not clear what you are wanting to do. Are you trying to redefine a 'week' so that it starts on Friday noon instead of Sunday midnight? And then look in each new 'week' from the current date to the end of the 'week'? Some simple example data covering 2 weeks would be really helpful to getting an appropriate answer. – thelatemail May 10 '17 at 22:27

1 Answer

Since you don't provide a reproducible example, I will try to explain it as simply as possible. You should add a small sample of your dataset, though.

"I want to select a specific week-long period":

You could define your first and last points, e.g. noon on the first Friday of 2017 and noon on Friday of this week:

f1 <- strptime("2017-01-06 12:00", format = "%Y-%m-%d %H:%M", tz = "UTC") # first Friday 2017
f2 <- strptime("2017-05-12 12:00", format = "%Y-%m-%d %H:%M", tz = "UTC") # this week

Then generate a weekly sequence of POSIXt values from Friday noon to Friday noon:

seq <- seq.POSIXt(f1, f2, by = "week")

"Count how many tweets were published between current system time":

Then you could use cut to put the tweets into bins, one bin per Friday-to-Friday week, e.g.:

# n breaks define n - 1 intervals, so cut() needs exactly length(seq) - 1 labels
cut(dataset, breaks = seq, labels = seq_len(length(seq) - 1), right = TRUE)
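To make the binning step concrete, here is a minimal sketch on made-up data. The timestamps and the names `tweets`, `breaks` and `bin` are hypothetical and only for illustration; everything is assumed to be in UTC, and only two weeks are used to keep it short.

```r
# Hypothetical break points: three Friday noons (UTC) -> two weekly bins
f1 <- as.POSIXct("2017-01-06 12:00", tz = "UTC")
f2 <- as.POSIXct("2017-01-20 12:00", tz = "UTC")
breaks <- seq(f1, f2, by = "week")

# Hypothetical tweet timestamps, including two right at a bin boundary
tweets <- as.POSIXct(c("2017-01-06 12:00:01", "2017-01-10 08:30:00",
                       "2017-01-13 12:00:00", "2017-01-13 12:00:01",
                       "2017-01-19 23:59:59"), tz = "UTC")

# right = TRUE makes each bin (Friday noon, next Friday noon],
# so a tweet at exactly Friday 12:00:00 belongs to the week that just ended
bin <- cut(tweets, breaks = breaks,
           labels = seq_len(length(breaks) - 1), right = TRUE)
print(bin)
```

Timestamps that fall outside the first and last break come back as NA, so it may be worth checking for those before counting.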

Finally, you need to group by bin and count the occurrences. Is this what you want? Hope this helps.
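The counting step, and one possible reading of "from the current weekday and time to the end of the period", could be sketched as follows. This uses modular arithmetic on seconds since a Friday-noon anchor instead of cut. The data and the names `anchor`, `week_id`, `offset` and `now_offset` are hypothetical, and it assumes UTC (in a time zone with daylight saving, fixed-length weeks would drift by an hour).

```r
# Hypothetical tweet timestamps (UTC)
tweets <- as.POSIXct(c("2017-01-07 09:00:00", "2017-01-11 15:00:00",
                       "2017-01-11 17:30:00", "2017-01-13 11:59:00",
                       "2017-01-14 10:00:00"), tz = "UTC")

anchor <- as.POSIXct("2017-01-06 12:00:00", tz = "UTC")  # a Friday noon
week_secs <- 7 * 24 * 3600

# Seconds since the anchor, split into a week index and a position in the week
elapsed <- as.numeric(difftime(tweets, anchor, units = "secs"))
week_id <- elapsed %/% week_secs + 1  # 1 = first Friday-to-Friday week
offset  <- elapsed %% week_secs       # seconds since the latest Friday noon

# Group by bin and count: tweets per Friday-to-Friday week
counts <- table(week_id)
print(counts)

# Reduce "now" to its position in the week, dropping the date itself.
# Fixed here so the example is reproducible; use Sys.time() in practice.
now <- as.POSIXct("2017-05-10 16:00:00", tz = "UTC")  # a Wednesday, 16:00
now_offset <- as.numeric(difftime(now, anchor, units = "secs")) %% week_secs

# Tweets in week 1 published at or after that weekday and time
n_remaining <- sum(week_id == 1 & offset >= now_offset)
print(n_remaining)
```

Because both the tweets and the current time are reduced to "seconds since the last Friday noon", the comparison no longer depends on the calendar date, which is the date-agnostic form the question asks about.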

Edgar Santos