-1

I have the following three tables ,members,ageGroup and log. I would like to find how many people from each age group access the system for a given duration. Here are the table details

tblMembers:

Member_Id , DOB, Age

tblAgeGroup:

AgeGroup Id, AgeGroup_Des, minAge, maxAge, meanAge

tblLog:

Member_Id, AccessDate

So when the Admin selects a date range such as 01/04/2014 to 30/04/2014

I need show the output similar to the following:

AgeGroup No of Users

15-25 -------------- 3

25-35-------------- 2

How ever I am not able to do so. Here is what I have done so far

 SELECT Member_Id, Age
   FROM tblLog
        JOIN tblMembers ON tblMembers.Member_Id = tblLog.Member_Id
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'
  GROUP BY Member_Id, Age

Based on the script, I am able to list the Members_Id and their age. How can I proceed on by grouping via the the group details found in the tblAgeGroup table.

Please do advice me. Thanks

Amarnath Balasubramanian
  • 9,300
  • 8
  • 34
  • 62
M.A
  • 1,073
  • 2
  • 15
  • 21

3 Answers3

1

Try

 SELECT ag.AgeGroup_Des, count(1)
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
        INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age and ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'
  GROUP BY ag.AgeGroup_Des

However, when joining using an interval, as we are doing here on tblAgeGroup, it is important to test if we get the correct number of rows.

So I would run the following queries, and validate that they indeed do return the same number of rows:

 SELECT COUNT(1)
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
        --INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age and ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'

and

 SELECT COUNT(1)
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
        INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age and ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'

There are two things that come to my mind, which can go wrong when joining using intervals as here.

Firstly; If the intervals is not uniquely defined, each member would get returned several times. Eg consider tblAgeGroup containing ... , minAge = 15 , maxAge = 35 ... and another entry: ... , minAge = 20 , maxAge = 25 ...

A member with an age of 21 would thus get returned in both intervals, thus creating inflation in your count.

Secondly; If Age is not defined in tblMembers, the row will be excluded, but this is probably intended, since we are using inner joins. You could get around this, with following:

 SELECT ISNULL(ag.AgeGroup_Des, 'No group found') as AgeGroup_Des,
        COUNT(1) as "AMT users"
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
         LEFT JOIN tblAgeGroup ag ON ag.minAge <= m.Age and ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'
  GROUP BY ISNULL(ag.AgeGroup_Des, 'No group found')
  ORDER BY 1

Edit: Added a SQLFiddle to test: http://sqlfiddle.com/#!6/137b8/5

Queries in the fiddle are:

Build:

CREATE TABLE tblMembers
(
    Member_ID int,
    DOB date,
    Age int
);

CREATE TABLE tblAgeGroup
(
    [AgeGroup Id] int,
    AgeGroup_Des varchar(16),
    minAge int,
    maxAge int,
    meanAge float
);

CREATE TABLE tblLog
(
    Member_id int,
    AccessDate date
);



INSERT INTO tblMembers
       (Member_ID, DOB, Age) 
SELECT 1, '1991-03-22', 23
 UNION ALL
SELECT 2, '2000-03-22', 14
 UNION ALL
SELECT 3, '1981-03-22', 33
 UNION ALL
SELECT 4, null, null;

INSERT INTO tblAgeGroup
       ([AgeGroup Id], AgeGroup_Des, minAge, maxAge, meanAge)
SELECT 1, '15-25', 15, 25, null
 UNION ALL
SELECT 2, '10-15', 10, 15, null
 UNION ALL
SELECT 3, '25-55', 25, 55, null;

INSERT INTO tblLog
       (Member_id, AccessDate)
SELECT 1, GETDATE()
 UNION ALL
SELECT 2, GETDATE()
 UNION ALL
SELECT 1, '2014-04-02'
 UNION ALL
SELECT 2, '2014-04-03'
 UNION ALL
SELECT 2, '2014-04-03'
 UNION ALL
SELECT 3, '2014-04-03'
 UNION ALL
SELECT 4, '2014-04-03';

Query:

-- First query: (exludes member 4, since no DOB is registered:
 SELECT ag.AgeGroup_Des, COUNT(1)
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
        INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age and ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'
  GROUP BY ag.AgeGroup_Des;

-- To validate that the counts are the same:
 SELECT COUNT(1)
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
        --INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age AND ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30';

-- Second validation:
-- Notice: this does return one less row, since we are using inner join,
-- and thus ommitting member 4
 SELECT COUNT(1)
  FROM tblLog l
       INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
       INNER JOIN tblAgeGroup ag ON ag.minAge <= m.Age AND ag.maxAge > m.Age
 WHERE AccessDate >= '2014-04-01'
   AND AccessDate <= '2014-04-30';

-- Last query, which includes members with no DOB registered, 
-- as well as members that does not fall into any group:
 SELECT ISNULL(ag.AgeGroup_Des, 'No group found') as AgeGroup_Des,
        COUNT(1) as "AMT users"
   FROM tblLog l
        INNER JOIN tblMembers m ON m.Member_Id = l.Member_Id
         LEFT JOIN tblAgeGroup ag ON ag.minAge <= m.Age AND ag.maxAge > m.Age
  WHERE AccessDate >= '2014-04-01'
    AND AccessDate <= '2014-04-30'
  GRUOP BY ISNULL(ag.AgeGroup_Des, 'No group found')
  ORDER BY 1;

2014-05-14: Edit:

I see in your comments, that you also need a count for the empty groups. In order to do this, I've "inverted" the from clause, so that we select all from tblAgeGroup, and then left join the other tables.

Also, I assumed that you needed a count of log entries, but after reading comments, I see that you ask for distinct users. I've altered the query accordingly.

select 
  ag.AgeGroup_Des
, count(l.Member_ID)          as [AMT Log entries]
, count(distinct l.Member_ID) as [AMT distinct members]
from tblAgeGroup ag
left join tblMembers m on ag.minAge <= m.Age and ag.maxAge > m.Age
left join tblLog l on l.Member_id = m.Member_ID
group by isnull(ag.AgeGroup_Des, 'No group found')
order by 1
;

which returns

AgeGroup_Des    AMT Log entries AMT distinct members
10-15               3               1
15-25               2               1
25-55               1               1
55-65               0               0

However, this approach does ignores every member which doesn't have a DOB / age registered. If we want to include these aswell, we need to edit the query to a full outer join:

select 
  isnull(ag.AgeGroup_Des, 'No group found') as AgeGroup_Des
, count(l.Member_ID)          as [AMT Log entries]
, count(distinct l.Member_ID) as [AMT distinct members]
from tblAgeGroup ag
full outer join tblMembers m on ag.minAge <= m.Age and ag.maxAge > m.Age
full outer join tblLog l on l.Member_id = m.Member_ID
group by isnull(ag.AgeGroup_Des, 'No group found')
order by 1
;

which returns

AgeGroup_Des    AMT Log entries AMT distinct members
10-15               3               1
15-25               2               1
25-55               1               1
55-65               0               0
No group found      1               1

See updated fiddle here: http://sqlfiddle.com/#!6/d8733/5

  • When i run the first script you posted i get 0 records. – M.A May 12 '14 at 08:40
  • @GunnarKundsen, However your second script indicates 2301 records and the third script is givng 0 records. May I know why is that so. – M.A May 12 '14 at 08:41
  • Without knowing your dataset, I can't say why that is so, but I've created some test data, where the query performs as expected, see this fiddle: http://sqlfiddle.com/#!6/137b8/5 (I've edited this into the answer aswell). How does your dataset differ? – Gunnar Sjúrðarson Knudsen May 12 '14 at 08:56
  • Hmm after seeing your demo, It is exactly the same as mine. The only issue is that I have multiple rows of records with the same member id. This is becuase every time the user access any page it will log. So for each member Id I have abt 100 row. How can I make it distinct. I tried but it is not working. – M.A May 12 '14 at 09:42
  • 1
    Multiple rows of each user in tblLog should not be a problem - in fact Member_ID = 2 is present twice in my example and counts it accordingly. The fact that the third script gives 0 records, tells me that minAge and MaxAge in tblAgeGroup does not match Age given in tblMembers. Could you make a SQLFiddle of your three tables with data, so i can check it? – Gunnar Sjúrðarson Knudsen May 12 '14 at 09:50
  • what I mean is your duplicate record is counted. It should not count the duplicated record and also the age groups that has no records should indicate 0 for example 55-70 there is no record in the log so it should say 0 – M.A May 14 '14 at 01:46
  • I manage to do the distinct record but i still cant figure out for the issue where there is no record for a particular age group. Can you advice. Here is the new fiddle with the distinct in it. http://sqlfiddle.com/#!6/93b49/4 – M.A May 14 '14 at 02:25
  • Thanks alot, I now understand better. However i have alos managed to do in another way which I would like your advice on. You can have a look at http://stackoverflow.com/questions/23646142/convert-sql-with-multiple-join-into-linq – M.A May 14 '14 at 09:40
  • Please do give comments on which would be better. Futhermore , if you assist can we change this to Linq or Linq Lamda extension. I mean your query. – M.A May 14 '14 at 09:42
  • I'm sorry, but I know virtually nothing about LINQ - I'm sure someone smarter will help you with the other question. I guess both versions are useful, but (I) prefer joins instead of subqueries, when possible. Subqueries will be executed once for each row, whereas a join is only executed once. (Depending on the optimizer). – Gunnar Sjúrðarson Knudsen May 14 '14 at 09:51
0
 SELECT usr_type, SUM(_15_25 + _25_35) AS qty
   FROM
        (SELECT CASE 
                     WHEN  age>=15 AND age<25 THEN '15-25'
                     WHEN age>=25 AND age<=35 THEN '25-35'
                     ELSE 'other'
                 END AS usr_type, 
                CASE
                     WHEN age>= 15 AND age < 25 THEN 1
                     ELSE 0
                 END AS _15_25, 
                CASE
                     WHEN age >= 25 AND age <= 35 THEN 1
                     ELSE 0
                 END AS _25_35 
                 Age
           FROM tblLog
                JOIN tblMembers ON tblMembers.Member_Id = tblLog.Member_Id
          WHERE AccessDate >= '2014-04-01'
            AND AccessDate <= '2014-04-30'
        ) T
  GROUP BY usr_type

try this logical solution.

Kevin Hogg
  • 1,771
  • 25
  • 34
realnumber3012
  • 1,062
  • 6
  • 10
0
 SELECT minAge, maxAge, count(*) as "No of Users" 
   FROM tblAgeGroup g 
        INNER JOIN tblMembers m ON Age BETWEEN minAge AND maxAge
        INNER JOIN tblLog l ON m.Member_ID = l.Member_ID
  WHERE AccessDate BETWEEN '2014-04-01' and '2014-04-30'
  GROUP BY minAge, maxAge
  ORDER BY minAge, maxAge
Kevin Hogg
  • 1,771
  • 25
  • 34
SaadH
  • 1,158
  • 2
  • 23
  • 38