3

I've got a table with 1.5 million records on a SQL Server 2008. There's a varchar column 'ReferenzNummer' that is indexed.

The following query executed in the SQL Management Studio works and is fast:

SELECT v1.Id, v2.Id FROM Vorpapier as v1 cross join Vorpapier as v2
WHERE v1.ReferenzNummer LIKE '7bd48e26-58d9-4c31-a755-a15500bce4c4'
    AND v2.ReferenzNummer LIKE '7bd4%'

(I know the query does not make much sense like this, there will be more constraints, but that's not important for the moment)

Now I'd like to execute a query like this from Entity Framework 5.0, my LINQ looks like this:

var result = (from v1 in vorpapierRepository.DbSet
              from v2 in vorpapierRepository.DbSet
              where v1.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4" &&
                  v2.ReferenzNummer.StartsWith("7bd4")
              select new { V1 = v1.Id, V2 = v2.Id })
            .Take(10)
            .ToList();

This tries to load the whole table into memory, leading to an OutOfMemoryException after some time. I've tried to move the WHERE parts, with no success:

var result = (from v1 in vorpapierRepository.DbSet.Where(v => v.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4")
              from v2 in vorpapierRepository.DbSet.Where(v => v.ReferenzNummer.StartsWith("7bd4"))
                        select new { V1 = v1.Id, V2 = v2.Id })
                        .Take(10)
                        .ToList();

Is it possible to tell Entity Framework to create a cross join statement, like the one I've written myself?

UPDATE 1

The EF generated SQL looks like this (for both queries)

SELECT [Extent1].[Id]             AS [Id],
     [Extent1].[VorpapierArtId] AS [VorpapierArtId],
     [Extent1].[ReferenzNummer] AS [ReferenzNummer],
     [Extent1].[IsImported]     AS [IsImported],
     [Extent1].[DwhVorpapierId] AS [DwhVorpapierId],
     [Extent1].[Datenbasis_Id]  AS [Datenbasis_Id]
FROM   [dbo].[Vorpapier] AS [Extent1]

UPDATE 2

When I change the LINQ query and join the table with itself on the field DatenbasisIDd (which is not exactly what I want, but it might work), EF creates a join:

        var result = (from v1 in vorpapierRepository.DbSet 
                      join v2 in vorpapierRepository.DbSet
                          on v1.DatenbasisId equals v2.DatenbasisId
                      where v1.ReferenzNummer == "7bd48e26-58d9-4c31-a755-a15500bce4c4" && v2.ReferenzNummer.StartsWith("7bd4")
                        select new { V1 = v1.Id, V2 = v2.Id })
                        .Take(10)
                        .ToList();

The resulting SQL query looks like this. It works and is fast enough.

SELECT TOP (10) 1              AS [C1],
            [Extent1].[Id] AS [Id],
            [Extent2].[Id] AS [Id1]
FROM   [dbo].[Vorpapier] AS [Extent1]
   INNER JOIN [dbo].[Vorpapier] AS [Extent2]
     ON ([Extent1].[Datenbasis_Id] = [Extent2].[Datenbasis_Id])
         OR (([Extent1].[Datenbasis_Id] IS NULL)
             AND ([Extent2].[Datenbasis_Id] IS NULL))
WHERE  (N'7bd48e26-58d9-4c31-a755-a15500bce4c4' = [Extent1].[ReferenzNummer])
   AND ([Extent2].[ReferenzNummer] LIKE N'7bd4%')

I still don't see, why EF doesn't create the cross join in the original query. Is it simply not supported?

Stefan
  • 121
  • 7
  • You must show your `VorpapierRepository` implementation. I assume the problem in repository implementation, because you LINQ look like correct. – Hamlet Hakobyan Feb 04 '13 at 16:34
  • @hamlet-hakobyan: The `vorpapierRepository` returns the EF DbSet. Therefore the implementation of that repository should not matter. – delixfe Feb 04 '13 at 16:47
  • @delixfe You'll can assume only. – Hamlet Hakobyan Feb 04 '13 at 17:44
  • vorpapierRepository.DbSet is an EF IDbSet, I've temporarily added it for testing purposes. – Stefan Feb 04 '13 at 19:15
  • What SQL is generated? Does it contain the predicates? – Gert Arnold Feb 04 '13 at 21:36
  • The SQL is basically a SELECT * FROM Vorpapier... which then tries to load all the 1.5 million records to the client. – Stefan Feb 05 '13 at 07:11
  • Which linq query generates SQL from your update? – Hamlet Hakobyan Feb 05 '13 at 07:23
  • Both queries generate the same SQL – Stefan Feb 05 '13 at 07:56
  • That indicates that in `vorpapierRepository.DbSet` a conversion to IEnumerable takes place. I think it's getting relevant to show its code. – Gert Arnold Feb 05 '13 at 08:40
  • That's all there is: private readonly IDbSet entityCollection; public IDbSet DbSet { get { return entityCollection; } } public Repository(IDbSet entityCollection) { Contract.Requires(entityCollection != null); this.entityCollection = entityCollection; } It's just the EF DbSet. If I create a query without adding the table twice (only one from), it's executed on the DB. – Stefan Feb 05 '13 at 09:11

1 Answers1

0

If you use a join in the linq statement it will get passed back to SQL Server. Here are some examples of the join operator in linq: http://code.msdn.microsoft.com/LINQ-Join-Operators-dabef4e9

groksrc
  • 2,910
  • 1
  • 29
  • 29