0

I have hotel booking data and there's an arrival and a departure date. I have successfully counted the days in between using difftime but I would now like to know the number of dates per month. If both arrival and departure date are within one month (like arrival on September 1st and departure on September 10th) that's not a problem of course but what do I do with bookings that are across months like arrival on September 25th and departure on October 4th or even years? In this case I would like to calculate how many days fall in September and how many days fall in October.

The overall goal is to calculate booked days per month / year.

Sofia
  • 21
  • 3

1 Answers1

0

Since you included no sample data (may I suggest you do so in next questions), I made it up to replicate what you want:

library(lubridate)
library(tidyverse)

#creating sample data
bookings <- tibble(
  pax = c("Jane", "John"),
  arrival = as.Date(c("2020-12-20", "2021-01-25")),
  departure = as.Date(c("2021-01-04", "2021-02-02"))
)

#creating a column with all booked dates to group_by and summarize
bookings <- bookings %>% 
  rowwise() %>% 
  mutate(booked_dates = list(seq(arrival, departure, by="days"))) %>% # this creates a column of tiny dataframes with the occupied dates by pax
  unnest(cols = booked_dates) %>% # this flattens the list-column into a regular one 
  mutate( # extracting the year and month
    year = year(booked_dates),
    month = month(booked_dates, label = TRUE)
  ) %>% 
  group_by(year, month) %>% # grouping and summarizing
  summarise(n_days = n())

Then you have the desired output:

bookings
# A tibble: 3 × 3
# Groups:   year [2]
   year month n_days
  <dbl> <ord>  <int>
1  2020 Dec       12
2  2021 Jan       11
3  2021 Feb        2
Gonzalo T F
  • 111
  • 10