4

I was reading relational algebra from one of the textbook. I came across DIVIDE operation. From Wikipedia:

The division is a binary operation that is written as R ÷ S. The result consists of the restrictions of tuples in R to the attribute names unique to R, i.e., in the header of R but not in the header of S, for which it holds that all their combinations with tuples in S are present in R.

Thus if R is:

+----+----+
| A  | B  |
+----+----+
| a1 | b1 |
| a2 | b1 |
| a3 | b1 |
| a4 | b1 |
| a1 | b2 |
| a3 | b2 |
| a2 | b3 |
| a3 | b3 |
| a4 | b3 |
| a1 | b4 |
| a2 | b4 |
| a3 | b4 |
+----+----+

and S is:

+----+
| A  |
+----+
| a1 |
| a2 |
| a3 |
+----+

then the output should be:

+----+
| B  |
+----+
| b1 |
| b4 |
+----+

Now the book gives relational equivalent steps to calculate DIVIDE operations as below. I am trying to immitate it exactly in SQL Server, but out of direction

  T1 ← πBR                //that is SELECT A FROM R
  T2 ← πB((S x T1) - R)   //SxT1 can simply be done as SELECT * FROM S,T1                                     
  T ← T1 - T2                        
Andrey Morozov
  • 7,839
  • 5
  • 53
  • 75
Mahesha999
  • 22,693
  • 29
  • 116
  • 189

2 Answers2

3

You have an inaccuracy in the formula. In your particular case you need projection on B instead of projection on A otherwise how the B attribute would appear in the result?

  T1 ← πBR                
  T2 ← πB((S x T1) - R)                                     
  T ← T1 - T2  

The query:

-- T ← T1 - T2 -------------------------------------
SELECT DISTINCT B FROM R -- T1 ← πBR              --
WHERE B NOT IN (                                  --
    -- T2 ← πB((S x T1) - R) -------------------  --
    SELECT CR.B                               --  --
    FROM (                                    --  --
        -- (S x T1) --------------------      --  --
        SELECT S.A, R2.B              --      --  --
        FROM S                        --      --  --
        CROSS JOIN (                  --      --  --
            -- T1 ← πBR               --      --  --
            SELECT DISTINCT B FROM R  --      --  --
        ) AS R2                       --      --  --
        --------------------------------      --  --
    ) AS CR                                   --  --
    WHERE NOT EXISTS (                        --  --
        SELECT *                              --  --
        FROM R                                --  --
        WHERE R.A = CR.A AND R.B = CR.B       --  --
    )-------------------------------------------  --
)---------------------------------------------------

The result:

B
--
b1
b4

SQLFiddle

Andrey Morozov
  • 7,839
  • 5
  • 53
  • 75
  • Man need exact imitation. I mean you used aggregate functions in both solutions, while the relational equivalent have none. – Mahesha999 Jan 11 '15 at 07:02
  • @AndreyMorozov, have a look at my answer that compares execution plans. SQL Server optimizer is really smart. – Vladimir Baranov Jan 11 '15 at 09:45
  • @VladimirBaranov I tried to run both queries together (in one batch) to obtain relative costs between them and found 53% against 47% where first is my query and second is yours - still there is little difference – Andrey Morozov Jan 11 '15 at 09:53
  • but! :) if I remove `DISTINCT` from `CROSS JOIN` then relative costs changes to 42% against 58% – Andrey Morozov Jan 11 '15 at 10:00
  • @AndreyMorozov, how did you manage to run just these two queries in one batch? In my script there are a bunch of INSERTs, so SSMS shows a plan for each insert first and then plans for queries. You can see in my screenshot that queries have numbers 16 and 17, queries before them are INSERTs. I don't know how to show plans just for those queries in a batch that I need. I tried to run the script without any INSERTs at all, with empty tables. In this case there are only two plans (identical to screenshot) and both have 50%. – Vladimir Baranov Jan 11 '15 at 10:03
  • To compare relative costs of both queries you need to create R and S tables separately and then execute both queries together (like here sqlfiddle.com/#!3/7251e/2) + (screen http://grab.by/DGvu) – Andrey Morozov Jan 11 '15 at 10:16
  • @AndreyMorozov, ah, so you created permanent tables. I thought it was possible without it. In general, it may be a question in itself: I have several queries in the SSMS window that I want to execute together. I'd like to see execution plans only of some of them. – Vladimir Baranov Jan 11 '15 at 10:17
  • @VladimirBaranov I think this would be the offtopic for this question :) Let's stop pollute the air :) – Andrey Morozov Jan 11 '15 at 10:23
1

There is a misprint in the relational algebra formulas. Instead of πA there should be πB.

Create sample tables:

DECLARE @R TABLE (A varchar(10), B varchar(10));

INSERT INTO @R (A, B) VALUES ('a1','b1');
INSERT INTO @R (A, B) VALUES ('a2','b1');
INSERT INTO @R (A, B) VALUES ('a3','b1');
INSERT INTO @R (A, B) VALUES ('a4','b1');
INSERT INTO @R (A, B) VALUES ('a1','b2');
INSERT INTO @R (A, B) VALUES ('a3','b2');
INSERT INTO @R (A, B) VALUES ('a2','b3');
INSERT INTO @R (A, B) VALUES ('a3','b3');
INSERT INTO @R (A, B) VALUES ('a4','b3');
INSERT INTO @R (A, B) VALUES ('a1','b4');
INSERT INTO @R (A, B) VALUES ('a2','b4');
INSERT INTO @R (A, B) VALUES ('a3','b4');

DECLARE @S TABLE (A varchar(10));

INSERT INTO @S (A) VALUES ('a1');
INSERT INTO @S (A) VALUES ('a2');
INSERT INTO @S (A) VALUES ('a3');

Literal translation of the given relational algebra formulas would be like this (tried in SQL Server 2008). The x operator is CROSS JOIN, the - operator is EXCEPT.

WITH
T1
AS
(
    SELECT B
    FROM @R
)
,T2
AS
(
    SELECT SS.A, T1.B
    FROM
        @S AS SS CROSS JOIN T1

    EXCEPT

    SELECT RR.A, RR.B
    FROM @R AS RR
)
SELECT T1.B
FROM T1

EXCEPT

SELECT T2.B
FROM T2
;

Result set:

B
b1
b4

Interestingly, that execution plan for the variant that uses EXCEPT and variant by Andrey Morozov are almost identical. I like SQL Server query optimizer! Top plan uses EXCEPT (see the query above), bottom plan is without EXCEPT (see the query in answer by @Andrey Morozov, I adapted it here for my temp tables):

SELECT DISTINCT B FROM @R -- T1 ← πBR              --
WHERE B NOT IN (                                  --
    -- T2 ← πB((S x T1) - R) -------------------  --
    SELECT CR.B                               --  --
    FROM (                                    --  --
        -- (S x T1) --------------------      --  --
        SELECT S.A, R2.B              --      --  --
        FROM @S AS S                        --      --  --
        CROSS JOIN (                  --      --  --
            -- T1 ← πBR               --      --  --
            SELECT DISTINCT B FROM @R  --      --  --
        ) AS R2                       --      --  --
        --------------------------------      --  --
    ) AS CR                                   --  --
    WHERE NOT EXISTS (                        --  --
        SELECT *                              --  --
        FROM @R AS R                                --  --
        WHERE R.A = CR.A AND R.B = CR.B       --  --
    )-------------------------------------------  --
)-------------

top plan with EXCEPT, bottom without EXCEPT

Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90