Running SQL Server 14.0.2037.2 on Windows 10 Enterprise version 21H1
I am trying to return IDs only when another columns entry across all rows within that ID matches the entries of a similar column in a second table, again within specific IDs. For example,
create table #F(patentid VARCHAR(8), subclass VARCHAR(3));
insert into #F values ('l','x');
insert into #F values ('l','y');
insert into #F values ('l','z');
insert into #F values ('m','x');
insert into #F values ('n','z');
create table #P(patentid VARCHAR(8), subclass VARCHAR(3));
insert into #P values ('b','x');
insert into #P values ('c','w');
insert into #P values ('c','x');
insert into #P values ('c','y');
insert into #P values ('c','z');
insert into #P values ('d','x');
insert into #P values ('d','y');
insert into #P values ('d','z');
insert into #P values ('e','x');
insert into #P values ('e','y');
insert into #P values ('e','z');
I am trying to write a query that will return, for each patentid in #F, all patentIDs in #P where the subclass entries match exactly. This true for "l" in #F that matches with "d" and "e" in #P [the matching subclasses are 'x', 'y' and 'z'] as well as for "m" in #F that matches with "b" in #P [the matching subclass is "x"]
the output should be:
l, d
l, e
m, b
The code needs to be efficient as both #F and #P contain millions of rows. I have tried Union, but that works with entire sets of rows and doesn't allow matching by a subset (i.e. within a patentid)
Any help much appreciated.