6

I have this table named people with two dates on MySQL:

| Name | start_date | end_date   |
| John | 2007-03-01 | 2009-10-12 |
| Mike | 2001-06-06 | 2010-12-01 |

I want to create a view that lets me search by activity year, being activity year any year between the start_date and the end_date. So, I'd like to get a field with a sequence of years, like this:

| Name | activity_years                                    |
| John | 2007,2008,2009                                    |
| Mike | 2001,2002,2003,2004,2005,2006,2007,2008,2009,2010 |

I've tried some approaches, but I can't get it. Since I want to create a view, I have to do it everything inside a SELECT statement and that is giving me some headache.

Jorge Suárez de Lis
  • 565
  • 1
  • 10
  • 29
  • Nothing that worths mention. Tried some loops inside the SELECT, and creating a procedure that created the sequence, but I didn't get it to work, probably because I was missing something. – Jorge Suárez de Lis Apr 23 '13 at 08:49

2 Answers2

5

Something like this should do it:-

SELECT a.Name, GROUP_CONCAT(YEAR(DATE_ADD(a.start_date, INTERVAL b.aNum YEAR))) AS activity_years  
FROM person a
CROSS JOIN (SELECT a.i + b.i * 10 AS aNum FROM integers a, integers b) b
WHERE YEAR(DATE_ADD(a.start_date, INTERVAL b.aNum YEAR)) <= YEAR(a.end_date)
GROUP BY a.Name

It relies on a table of integers with a column called i, with the values 0 to 9. It joins this against itself to get a range of numbers from 0 to 99, so copes with date ranges that far apart.

Removing the subselects to use it in a view

SELECT p.Name, GROUP_CONCAT(YEAR(DATE_ADD(p.start_date, INTERVAL (a.i + b.i * 10) YEAR))) AS activity_years  
FROM person p
CROSS JOIN integers a
CROSS JOIN integers b
WHERE YEAR(DATE_ADD(p.start_date, INTERVAL (a.i + b.i * 10) YEAR)) <= YEAR(p.end_date)
GROUP BY p.Name
Kickstart
  • 21,403
  • 2
  • 21
  • 33
  • If OP finds it's acceptable to create new permanent helping table it might be better to just call it Years and store all possible years in it (1980-2050 for example) and just JOIN on that table to get data? – Nenad Zivkovic Apr 23 '13 at 08:59
  • Fair point and possibly true. Depends on date ranges, how often it is used, etc. For example if the date ranges were for the birth date and death date of ancestors going back a thousand years it might be a bit less flexible. Also if it were to be expanded to give all the months and years in the range then it is easy to do with the above. – Kickstart Apr 23 '13 at 09:05
  • Edited to remove the need for a subselect – Kickstart Apr 23 '13 at 10:36
  • Thank you, I've ended up using the approach without the subquery. However, I changed the syntax to do the condition on the JOIN clause, what boosts the query performance since only the needed number of rows are joined. Also, 10 years is enough for me, the maximum period will be probably 5 years, so 100 years is way too much. This is my version: `SELECT p.name, group_concat(distinct year(p.start_date + interval a.i year) separator ',') AS activity_years FROM person p LEFT JOIN integers a ON (p.start_date + interval a.i year) <= p.end_date;` – Jorge Suárez de Lis Apr 23 '13 at 11:26
  • 1
    Should do it. It would expect the Mysql optimiser to stop there being any difference in performance from putting the check in the ON clause or on the WHERE clause on a simple query like this. The LEFT JOIN can be changed to an INNER JOIN which would probably be a touch quicker. – Kickstart Apr 23 '13 at 11:31
4

SQLFiddle demo

select name,
       group_concat(tYears.row ORDER BY tYears.row)
  from people

join

(
SELECT @row := @row + 1 as row FROM 
(select 0 union all select 1 union all select 3 union all select 4 union all 
  select 5 union all select 6 union all select 6 union all select 7) t,
(select 0 union all select 1 union all select 3 union all select 4 union all 
  select 5 union all select 6 union all select 6 union all select 7) t2, 
(select 0 union all select 1 union all select 3 union all select 4 union all 
  select 5 union all select 6 union all select 6 union all select 7) t3, 

(SELECT @row:=1900) t21
) tYears

on tYears.row between year(start_date) and year(end_date)
group by name
valex
  • 23,966
  • 7
  • 43
  • 60
  • I don't fully understand this, but I can't get it to run anyways. It seems to work properly on a plain SELECT, but I can't put this SELECT into a VIEW. I get this error: `Error 1349: View's SELECT contains a subquery in the FROM clause SQL Statement` – Jorge Suárez de Lis Apr 23 '13 at 10:18
  • 2
    It is quite simple. It is cross joining the results of a few selects together to get a large number of results (each of the selects is bringing back 8 rows, so 512 rows returned), and joining that against a select for an initial variable (ie @row) set to 1900. It then brings that variable back once for each row, adding 1 each time. Hence giving a range from 1901 to 2413. It then discards any occurance of that variable which isn't between the year of the start data and the end date, and then uses GROUP_CONCAT to bring the remaining years back in a single field. – Kickstart Apr 23 '13 at 10:41
  • Thank you for the explanation, however this still won't work because the subquery limitation on views. – Jorge Suárez de Lis Apr 23 '13 at 11:17