0

This df is from the complete journey package. have join df's in order to show the transaction_timestamp, household_id, product_type and product_kind). Product kind is a variable created to show what type of item bought was. The problem is, each transaction timestamp and the household id are repeated with each row showing a different product_kind. How can I make it so that there is a new row which combines the transaction?

trying to create a new row that shows when both beer and ice were purchased during the same transaction_timestamp and household_id, while also showing when they were purchased by themselves.

library(completejourney)
transactions <- get_transactions()

transactions%>%
  inner_join(products, by = "product_id")%>%
  mutate(product_kind = str_extract(product_type, regex("Wine|Beer|ICE - CRUSHED/CUBED|DELI TRAY:MEAT AND CHEESE",ignore_case = TRUE))) %>%
  filter(!str_detect(product_type,regex("cooking|vinegar",ignore_case = TRUE )))%>%
  filter(str_detect(product_kind,regex("Wine|Beer|ICE - CRUSHED/CUBED|Cheese",ignore_case = TRUE)))%>%
  group_by(product_kind)%>%
  summarise(total_sales_value = sum(sales_value, total_sales_quantity = sum(quantity)))%>%
  arrange(desc(total_sales_value))

transactions%>%
  select(transaction_timestamp, household_id, product_type, product_kind)%>%
  group_by(transaction_timestamp)%>%
  mutate(Cold_One = case_when(product_kind == 'BEER'& product_kind == "ICE - CRUSHED/CUBED" ~"Beer and Ice"))%>%
  count(Cold_One)
  • 2
    you should include the code that calls the other data and joins it so that someone can reproduce this and help you with it! – sconfluentus Oct 06 '22 at 21:43

0 Answers0