This df is from the complete journey package. have join df's in order to show the transaction_timestamp, household_id, product_type and product_kind). Product kind is a variable created to show what type of item bought was. The problem is, each transaction timestamp and the household id are repeated with each row showing a different product_kind. How can I make it so that there is a new row which combines the transaction?
trying to create a new row that shows when both beer and ice were purchased during the same transaction_timestamp and household_id, while also showing when they were purchased by themselves.
library(completejourney)
transactions <- get_transactions()
transactions%>%
inner_join(products, by = "product_id")%>%
mutate(product_kind = str_extract(product_type, regex("Wine|Beer|ICE - CRUSHED/CUBED|DELI TRAY:MEAT AND CHEESE",ignore_case = TRUE))) %>%
filter(!str_detect(product_type,regex("cooking|vinegar",ignore_case = TRUE )))%>%
filter(str_detect(product_kind,regex("Wine|Beer|ICE - CRUSHED/CUBED|Cheese",ignore_case = TRUE)))%>%
group_by(product_kind)%>%
summarise(total_sales_value = sum(sales_value, total_sales_quantity = sum(quantity)))%>%
arrange(desc(total_sales_value))
transactions%>%
select(transaction_timestamp, household_id, product_type, product_kind)%>%
group_by(transaction_timestamp)%>%
mutate(Cold_One = case_when(product_kind == 'BEER'& product_kind == "ICE - CRUSHED/CUBED" ~"Beer and Ice"))%>%
count(Cold_One)