4

Relational division is one of Codd's primitive relational operators, colloquially known as the suppliers who supply all parts. There have been various translations into SQL e.g. Celko discusses several approaches using the example of the pilots who can fly all the planes in the hangar.

The one I prefer is "division with set operators" because it is "with remainder" (i.e. Wilson can also fly a F-17 Fighter but there isn't one in the hangar) and how it handles the case when the divisor is the empty set (i.e. when the hangar is empty then all pilots are returned):

WITH PilotSkills
     AS
     (
      SELECT * 
        FROM (
              VALUES ( 'Celko', 'Piper Cub' ), 
                     ( 'Higgins', 'B-52 Bomber' ), ( 'Higgins', 'F-14 Fighter' ),
                     ( 'Higgins', 'Piper Cub' ), 
                     ( 'Jones', 'B-52 Bomber' ), ( 'Jones', 'F-14 Fighter' ),
                     ( 'Smith', 'B-1 Bomber' ), ( 'Smith', 'B-52 Bomber' ),
                     ( 'Smith', 'F-14 Fighter' ),
                     ( 'Wilson', 'B-1 Bomber' ), ( 'Wilson', 'B-52 Bomber' ),
                     ( 'Wilson', 'F-14 Fighter' ), ( 'Wilson', 'F-17 Fighter' )
             ) AS T ( pilot_name, plane_name )
     ), 
     Hangar
     AS
     (
      SELECT * 
        FROM (
              VALUES ( 'B-1 Bomber' ), 
                     ( 'B-52 Bomber' ), 
                     ( 'F-14 Fighter' )
             ) AS T ( plane_name )
     )
SELECT DISTINCT pilot_name 
  FROM PilotSkills AS P1
 WHERE NOT EXISTS (
                   SELECT plane_name 
                     FROM Hangar
                   EXCEPT
                   SELECT plane_name
                     FROM PilotSkills AS P2
                    WHERE P1.pilot_name = P2.pilot_name
                  );

Now I need to do this in LINQ to Objects. Here's a suggested direct translation:

var hangar = new [] 
{ 
    new { PlaneName = "B-1 Bomber" },
    new { PlaneName = "F-14 Fighter" },
    new { PlaneName = "B-52 Bomber" }
}.AsEnumerable();

var pilotSkills = new [] 
{ 
    new { PilotName = "Celko", PlaneName = "Piper Cub" },
    new { PilotName = "Higgins", PlaneName = "B-52 Bomber" },
    new { PilotName = "Higgins", PlaneName = "F-14 Fighter" },
    new { PilotName = "Higgins", PlaneName = "Piper Cub" },
    new { PilotName = "Jones", PlaneName = "B-52 Bomber" },
    new { PilotName = "Jones", PlaneName = "F-14 Fighter" },
    new { PilotName = "Smith", PlaneName = "B-1 Bomber" },
    new { PilotName = "Smith", PlaneName = "B-52 Bomber" },
    new { PilotName = "Smith", PlaneName = "F-14 Fighter" },
    new { PilotName = "Wilson", PlaneName = "B-1 Bomber" },
    new { PilotName = "Wilson", PlaneName = "B-52 Bomber" },
    new { PilotName = "Wilson", PlaneName = "F-14 Fighter" },
    new { PilotName = "Wilson", PlaneName = "F-17 Fighter" }
}.AsEnumerable();

var actual = pilotSkills.Where
(
    p1 => hangar.Except
    ( 
        pilotSkills.Where( p2 => p2.PilotName == p1.PilotName )
                    .Select( p2 => new { p2.PlaneName } )
    ).Any() == false 
).Select( p1 => new { p1.PilotName } ).Distinct();

var expected = new [] 
{
    new { PilotName = "Smith" },
    new { PilotName = "Wilson" }
};

Assert.That( actual, Is.EquivalentTo( expected ) );

As LINQ is supposedly based on the relational algebra then a direct translation seems reasonable. But is there a better 'native' LINQ approach?


Reflecting on @Daniel Hilgarth's answer, in .NET Land the data is likely to be 'grouped' to begin with:

var pilotSkills = new [] 
{ 
    new { PilotName = "Celko", 
            Planes = new [] 
            { new { PlaneName = "Piper Cub" }, } },
    new { PilotName = "Higgins", 
            Planes = new [] 
            { new { PlaneName = "B-52 Bomber" }, 
              new { PlaneName = "F-14 Fighter" }, 
              new { PlaneName = "Piper Cub" }, } },
    new { PilotName = "Jones", 
            Planes = new [] 
            { new { PlaneName = "B-52 Bomber" }, 
              new { PlaneName = "F-14 Fighter" }, } },
    new { PilotName = "Smith", 
            Planes = new [] 
            { new { PlaneName = "B-1 Bomber" }, 
              new { PlaneName = "B-52 Bomber" }, 
              new { PlaneName = "F-14 Fighter" }, } },
    new { PilotName = "Wilson", 
            Planes = new [] 
            { new { PlaneName = "B-1 Bomber" }, 
              new { PlaneName = "B-52 Bomber" }, 
              new { PlaneName = "F-14 Fighter" }, 
              new { PlaneName = "F-17 Fighter" }, } },
};

...and projecting just the name is arbitrary, making the potential solutions much more straightforward:

 // Easy to understand at a glance:
var actual1 = pilotSkills.Where( x => hangar.All( y => x.Planes.Contains(y) ));

// Potentially more efficient:
var actual = pilotSkills.Where( x => !hangar.Except( x.Planes ).Any() );
onedaywhen
  • 55,269
  • 12
  • 100
  • 138
  • You don't need the `AsEnumerable()`, arrays are IEnumerable – Bob Vale Apr 25 '13 at 08:21
  • Codd's division isn't primitive. And informally it returns "the suppliers who supply all parts" *and* who supply at least one part. (Though I'd agree that your informal phrasing is sadly the common one.) So it's not actually that useful. The "approaches" aren't "translations" of the original, they are translations of the original *and different but reminiscent* operators. (Best/easiest to reason in terms of relational subset operators then convert to algebra/calculus.) PS "LINQ is supposedly based on the relational algebra"--ha ha. Inspired by, OK. – philipxy Dec 14 '17 at 09:04
  • @philipxy Quote from the article I cited: "This article describes monads and LINQ as a generalization of the relational algebra". If you have a citation for a refutation I'd be extremely interested. Similarly, I didn't mean to imply division is primitive and don't think I have; the article I linked to refers (somewhat informally) to Todd's and Romley’s. But if you have something for me to read about why division isn't a primitive operator then it would be gratefully received. – onedaywhen Jan 09 '18 at 08:32

2 Answers2

5

The following should yield the same result:

pilotSkills.GroupBy(x => x.PilotName, x => x.PlaneName)
           .Where(g => hangar.All(y => g.Contains(y.PlaneName)))

This will return one group per pilot that can fly all planes in the hangar.
The key of the group is the pilot's name, the content of the group are all the planes the pilot can fly, even those that aren't in the hangar.

If you now just want the pilots, you can add a .Select(g => new { PilotName = g.Key }) to the end of the query.


Using the above approach with Except, making it closer to the OP's original:

pilotSkills.GroupBy(x => x.PilotName, x => new { x.PlaneName } )
           .Where(g => !hangar.Except(g).Any());

This second query is potentially better, because it iterates g only once; the first query with Contains iterates it N times with N being the number of planes in the hangar.

Daniel Hilgarth
  • 171,043
  • 40
  • 335
  • 443
  • *$(*%£" Just beat me to the answer, however you do need a select on the end to achieve exactly the same result as the OP... – Bob Vale Apr 25 '13 at 08:33
  • 1
    @BobVale: Yeah, I beat you by like 10 minutes :p And yes, I already added the select. To me it was more important to demonstrate that all the planes are returned, not just those that are in the hangar. – Daniel Hilgarth Apr 25 '13 at 08:34
  • @BobVale: I hate it when this happens to me... :) – Daniel Hilgarth Apr 25 '13 at 08:35
  • @BobVale: BTW: Your answer was incorrect. You had it switched around. You returned those polits for who every plane they can fly exists in the hangar. E.g. you returned `Jones`. He can't fly all planes that are in the hangar. But all planes that he can fly are there. – Daniel Hilgarth Apr 25 '13 at 08:44
  • I twigged that, that's why I deleted mine as you had given pretty much the same answer just with the condition the correct way round – Bob Vale Apr 25 '13 at 09:38
  • Ah-ha, `GroupBy` simplifies things considerably (not used it much in LINQ). I've added my own answer based on your approach but closer to my original, what do you think? – onedaywhen Apr 25 '13 at 10:04
1

Using the approach in @Daniel Hilgarth's answer but closer to my original:

var actual = pilotSkills
                 .GroupBy(x => x.PilotName, x => new { x.PlaneName })
                 .Where(g => !hangar.Except(g).Any())
                 .Select(x => new { PilotName = x.Key });
onedaywhen
  • 55,269
  • 12
  • 100
  • 138
  • Your solution with `Except` is potentially better, because it iterates `g` only once. My solution with `Contains` iterates it N times with N being the number of planes in the hangar. – Daniel Hilgarth Apr 25 '13 at 10:17