0

disclaimer : I don't have full control over the db schema don't judge the data structure or the naming conventions :)

I am doing this large query with multiple joins :

SELECT TOP 30   
iss.iss_lKey as IssueId,  
iss.iss_sName as IssueName,  
con.con_lKey as ContainerId,  
con.con_sName as ContainerName,  
sto.sto_lKey as StoryId,   
sto.sto_sName as StoryName,  
sto.sto_Guid as StoryGuid,  
sto.sto_sByline as Byline,  
sto.sto_created_dWhen as StoryCreatedDate,  
sto.sto_deadline_dWhen as StoryDeadline,  
sto.sto_lType as StoryType,  
sto.sto_sct_lKey as StoryCategory,  
sto.sto_created_use_lKey as CreatedBy,  
sfv.sfv_tValue as FieldValue,  
sf.sfe_lKey as StoryFieldId,  
sf.sfe_sCaption as StoryFieldCaption,   
sre.sre_lIndex as RevisionIndex  
FROM tStory30 sto  
JOIN tContainer30 con ON sto.sto_con_lKey = con.con_lKey  
JOIN tIssue30 iss ON con.con_iss_lKey = iss.iss_lKey  
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey  
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey  
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey  
WHERE sre.sre_lIndex = 0  
AND (sto.sto_sName LIKE '%' + @0 + '%'  
OR sfv.sfv_tValue LIKE '%' + @0 + '%')";

What I need is really only one row by StoryId, that includes the FieldValue that matched if there was any. I am currently grouping in the code to produce the output, but that prevents me from paging the results.

from r in items
   group r by new { r.StoryId, r.ContainerId, r.IssueId }
   into storyGroup
   select {
       storyGroup.Key.StoryId,
       storyGroup.Key.ContainerId,
       storyGroup.Key.IssueId,
       Hits = storyGroup.ToList()
    }

Is there any way to achieve this kind of grouping in sql, so that I could then page the result properly (using ROW_NUMBER() OVER)?

Also, I am aware that this is bad practice and should use FullText search. it is planned to setup a solr instance, or use the fulltext options in sqlserver. This is a first attempt to get a smthg going.

EDIT

trying to explain verbally what I try to achieve :

For the context, our app is a cms for magazine editor/publisher.

for a given magazine they have many Issues
each issue has many Container (sort of logical article group)
in each container you have several stories
a story van have 0 or many revisions
the fields of a story are stored by revision (many field per revision)
and a field has a field value.

I need to retrieve the stories that have a given text in the name or in a field value of the first revision (that's the where revisionIndex = 0). but I also need to retrieve associated data for each story. (issueId, name, containerId and name, and so one..) the difficult one is probably to retrieve one of the fieldvalue that matched the search. I don't need all of them, just one...

hope this helps!

EDIT Sample data searching for "test". I simplified the columns to make it easier to understand.

Row | IssueId  |  IssueName    |    ContainerId  |  StoryId    |    FieldValue  
1   |   11       IssueName A          394             868          Test Marsupilami bla bla youpi
2   |   40       IssueName B          6               631          story save test
3   |   40       IssueName B          6               666          test story
4   |   4        IssueName c          30              846          test abs
5   |   4        IssueName c          30              846          absc test
6   |   4        IssueName c          30              846          hello test

I am able to get the row number in sqlserver on my query, but here, as you see, I get amultiple times the same story. In this case, I could have simple the following result:

Row | IssueId  |  IssueName    |    ContainerId  |  StoryId    |    FieldValue  
1   |   11       IssueName A          394             868          Test Marsupilami bla bla youpi
2   |   40       IssueName B          6               631          story save test
3   |   4        IssueName c          30              846          test abs

if a story would have test in the story name, then I am ok with a null value in the column FieldValue which field value is selected doesn't matter much.

Stéphane
  • 11,755
  • 7
  • 49
  • 63
  • You said you want only one row per `StoryId`, does that mean that you can get more than one row that actually matches `FieldValue`. If so, how do we choose the right result?, what is the criteria we should follow? – Lamak Sep 20 '11 at 14:29
  • I know that this is where the problem lies... I retrieving the first field value that matched the sfv.sfv_tValue LIKE '%@0%' clause would be enough.. – Stéphane Sep 20 '11 at 14:32
  • Ok, But what makes it the "first value"?, do we order by `FieldValue` alphabetically? – Lamak Sep 20 '11 at 14:49
  • that doesn't really matter in fact. getting any of the fieldvalue that contains the searched value would be enough. – Stéphane Sep 20 '11 at 14:54

1 Answers1

0

This is a digression but are you aware that you have converted a left join to an inner join?

LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey   
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey   
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey   
WHERE sre.sre_lIndex = 0

try this instead

LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey   
    AND sre.sre_lIndex = 0
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey   
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey   

(I would have done this in a comment but it is easier to see the code change here.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
  • ok, thanks for that. the thinking here is that there are stories without revisions that are match the storyname where clause. and I do want these. – Stéphane Sep 20 '11 at 14:20
  • 1
    This explains in more detail: http://wiki.lessthandot.com/index.php/WHERE_conditions_on_a_LEFT_JOIN – HLGEM Sep 20 '11 at 14:37