2

The data in my table looks like this:

date, app, country, sales
2017-01-01,XYZ,US,10000
2017-01-01,XYZ,GB,2000
2017-01-02,XYZ,US,30000
2017-01-02,XYZ,GB,1000

I need to find, for each app on a daily basis, the ratio of US sales to GB sales, so ideally the result would look like this:

date, app, ratio
2017-01-01,XYZ,10000/2000 = 5
2017-01-02,XYZ,30000/1000 = 30

I'm currently dumping everything into a csv and doing my calculations offline in Python but I wanted to move everything onto the SQL side. One option would be to aggregate each country into a subquery, join and then divide, such as

select d1_us.date, d1_us.app, d1_us.sales / d1_gb.sales from
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'US') as d1_us
join 
(select date, app, sales from table where date between '2017-01-01' and '2017-01-10' and country = 'GB') as d1_gb
on d1_us.app = d1_gb.app and d1_us.date = d1_gb.date

Is there a less messy way to go about doing this?

Craig
  • 1,929
  • 5
  • 30
  • 51

3 Answers3

8

You can use the ratio of SUM(CASE WHEN) and GROUP BY in your query to do this without requiring a subquery.

SELECT DATE, 
       APP,
       SUM(CASE WHEN COUNTRY = 'US' THEN SALES ELSE 0 END) /
       SUM(CASE WHEN COUNTRY = 'GB' THEN SALES END) AS RATIO    
FROM TABLE1
GROUP BY DATE, APP;

Based on the likelihood of the GB sales being zero, you can tweak the GB's ELSE condition, maybe ELSE 1, to avoid Divide by zero error. It really depends on how you want to handle exceptions.

Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
Vash
  • 1,767
  • 2
  • 12
  • 19
0

You can use one query with grouping and provide the condition once:

SELECT date, app,
       SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END) /
       SUM(CASE WHEN country = 'GB' THEN SALES END) AS ratio
WHERE date between '2017-01-01' AND '2017-01-10'
FROM your_table
GROUP BY date, app;

However, this gives you zero if there are no records for US and NULL if there are no records for GB. If you need to return different values for those cases, you can use another CASE WHEN surrounding the division. For example, to return -1 and -2 respectively, you can use:

SELECT date, app,
       CASE WHEN COUNT(CASE WHEN country = 'US' THEN 1 ELSE 0 END) = 0 THEN -1
            WHEN COUNT(CASE WHEN country = 'GB' THEN 1 ELSE 0 END) = 0 THEN -2
            ELSE SUM(CASE WHEN country = 'US' THEN SALES ELSE 0 END) /
                 SUM(CASE WHEN country = 'GB' THEN SALES END)
            END AS ratio
WHERE date between '2017-01-01' AND '2017-01-10'
FROM your_table
GROUP BY date, app;
Racil Hilan
  • 24,690
  • 13
  • 50
  • 55
0
DROP TABLE IF EXISTS t;
CREATE TABLE t (
  date DATE,
  app VARCHAR(5),
  country VARCHAR(5),
  sales DECIMAL(10,2)
);

INSERT INTO t VALUES
  ('2017-01-01','XYZ','US',10000),
  ('2017-01-01','XYZ','GB',2000),
  ('2017-01-02','XYZ','US',30000),
  ('2017-01-02','XYZ','GB',1000);


WITH q AS (
    SELECT
      date,
      app,
      country,
      SUM(sales) AS sales
    FROM t
    GROUP BY date, app, country
) SELECT
    q1.date,
    q1.app,
    q1.country || ' vs ' || NVL(q2.country,'-') AS ratio_between,
    CASE WHEN q2.sales IS NULL OR q2.sales = 0 THEN 0 ELSE ROUND(q1.sales / q2.sales, 2) END AS ratio
  FROM q AS q1
    LEFT JOIN q AS q2 ON q2.date = q1.date AND
                    q2.app = q1.app AND
                    q2.country != q1.country
  -- WHERE q1.country = 'US'
  ORDER BY q1.date;

Results for any country vs any country (WHERE q1.country='US' is commented out)

date,app,ratio_between,ratio
2017-01-01,XYZ,GB vs US,0.20
2017-01-01,XYZ,US vs GB,5.00
2017-01-02,XYZ,GB vs US,0.03
2017-01-02,XYZ,US vs GB,30.00

Results for US vs any other country (WHERE q1.country='US' uncommented)

date,app,ratio_between,ratio
2017-01-01,XYZ,US vs GB,5.00
2017-01-02,XYZ,US vs GB,30.00

The trick is in JOIN clause. Results of a subquery q which aggregates data by date, app and country are joined with results themselves but on date and app.

This way, for every date, app and country we get a "match" with any another country on same date and app. By adding q1.country != q2.country, we exclude results for same country, highlighted below with *

date,app,country,sales,date,app,country,sales
*2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,GB,2000.00*
2017-01-01,XYZ,GB,2000.00,2017-01-01,XYZ,US,10000.00
2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,GB,2000.00
*2017-01-01,XYZ,US,10000.00,2017-01-01,XYZ,US,10000.00*
2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,US,30000.00
*2017-01-02,XYZ,GB,1000.00,2017-01-02,XYZ,GB,1000.00*
*2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,US,30000.00*
2017-01-02,XYZ,US,30000.00,2017-01-02,XYZ,GB,1000.00
Pawel
  • 626
  • 5
  • 7