0

So I'm working on an RFM analysis, and with lots of help, was able to put together the following query that outputs the customer_id, r score, f score, m score, and lastly a combined rfm score:

--This will first create quintiles using the ntile function
--Then factor in the conditions
--Then combine the score
--Then the substrings will seperate each score's individual points

SELECT *,
    SUBSTRING(rfm_combined,1,1) AS recency_score,
    SUBSTRING(rfm_combined,2,1) AS frequency_score,
    SUBSTRING(rfm_combined,3,1) AS monetary_score
FROM (

SELECT
    customer_id,
    rfm_recency*100 + rfm_frequency*10 + rfm_monetary AS rfm_combined
FROM
    (SELECT
    customer_id,
    ntile(5) over (order by last_order_date) AS rfm_recency,
    ntile(5) over (order by count_order) AS rfm_frequency,
    ntile(5) over (order by total_spent) AS rfm_monetary
FROM
    (SELECT
    customer_id,
    MAX(oms_order_date) AS last_order_date,
    COUNT(*) AS count_order,
    SUM(quantity_ordered * unit_price_amount) AS total_spent
FROM 
    l_dmw_order_report
WHERE
    order_type NOT IN ('Sales Return', 'Sales Price Adjustment')
    AND item_description_1 NOT IN ('freight', 'FREIGHT', 'Freight')
    AND line_status NOT IN ('CANCELLED', 'HOLD')
    AND oms_order_date BETWEEN '2019-01-01' AND CURRENT_DATE
    AND customer_id = 'US621111112234061'

GROUP BY customer_id))

ORDER BY customer_id desc)

In the above, you will notice that I am forcing it to only output on a particular customer_id. That is because I wanted to test to see if this query is accounting for when a customer_id appears in multiple YearMonth categories (because they could have bought in Jan, then again in Feb, then again in Nov).

The issue here is that, although the query outputs the right scores, it only seems to be accounting for the customer_id once, regardless of if it appears in multiple months. For this particular customer ID, I see that they appear in Jan 2019, Feb 2019, and Nov 2019, so it should be giving me 3 rows instead of just 1. Been testing for a few hours and can't seem to find the cause, but I suspect that my grouping may be wrong.

Thank you for your help and let me know if you have any questions!!

Best,

Z

Z41N
  • 97
  • 10
  • You have `group by customer_id` in the subquery. Why would you expect more than one row for a `customer_id` in the result set? – Gordon Linoff Dec 06 '19 at 20:44
  • That's why I mentioned it in my post, but I'm having trouble wrapping my head around how I can ungroup this so that it pulls every customer for every YearMonth combo that customer is in. I'm trying to remove the grouping but then I'm getting a bit confused on how I can keep the columns while also removing the grouping.. – Z41N Dec 09 '19 at 14:39

0 Answers0