Removing duplicate rows from relation

Question

I have the following code which produces a relation:

SELECT book_id, shipments.customer_id
FROM shipments 
LEFT JOIN editions ON (shipments.isbn = editions.isbn)
LEFT JOIN customers ON (shipments.customer_id = customers.customer_id)

In this relation, there are customer_ids as well as book_ids of books they have bought. My goal is to create a relation with each book in it and then how many unique customers bought it. I assume one way to achieve this is to eliminate all duplicate rows in the relation and then counting the instances of each book_id. So my question is: How can I delete all duplicate rows from this relation?

Thanks!

EDIT: So what I mean is that I want all the rows in the relation to be unique. If there are three identical rows for example, two of them should be removed.

Please define *duplicates*: duplicates in terms of ordered books, or in terms of customers who placed the orders? [after rereading: customers ... No: you also want a count. Please rephrase.] — wildplasser, Sep 10 '15 at 23:09
Craig, I have experimented with the GROUP BY but with no good results, any suggestions to how I could use it? — Sahand, Sep 10 '15 at 23:24

score 0 · Answer 1 · answered Sep 10 '15 at 23:45

0

This will give you all the {customer,edition} pairs for which an order exists:

SELECT *
FROM customers c
JOIN editions e ON (
    SELECT * FROM shipments s
    WHERE s.isbn = e.isbn
      AND s.customer_id = c.customer_id
    );

answered Sep 10 '15 at 23:45

wildplasser

43,142
8
66
109

Patrick · Answer 2 · 2015-09-11T02:41:50.487

The duplicates are in table shipments. You can remove these with a DISTINCT clause and then count them in an outer query GROUP BY isbn:

SELECT isbn, count(customer_id) AS unique_buyers
FROM (
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
GROUP BY isbn;

If you want a list of all books, even where no purchases were made, you should LEFT JOIN the above to the list of all books:

SELECT isbn, coalesce(unique_buyers, 0) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT isbn, count(customer_id) AS unique_buyers
  FROM (
    SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
  GROUP BY isbn) books_bought USING (isbn)
ORDER BY isbn;

You can write this more succinctly by joining before counting:

SELECT isbn, count(customer_id) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer USING (isbn)
GROUP BY isbn
ORDER BY isbn;

Removing duplicate rows from relation

2 Answers2