I have the following SQL problem, and I am looking for the most efficient way to solve it. Suppose I have the following table:
r|Company - StartPeriod - EndPeriod - NumberOfEmployees - NumberOfMachines ...
1|A - 2014-1-1 - 2014-2-5 - 30 - 1
2|A - 2014-2-6 - 2014-3-7 - 30 - 1
3|A - 2014-3-8 - 2014-4-9 - 31 - 1
4|A - 2014-4-9 - 2014-5-1 - 32 - 1
5|B - 2014-1-2 - 2014-2-1 - 20 - 2
6|B - 2014-2-2 - 2014-3-1 - 20 - 2
7|B - 2014-3-2 - 2014-4-1 - 20 - 2
Notice the redundant data? The companies are sampled now and then, so the periods vary more or less randomly in length.
Now the question is: how can I merge rows if and only if the variables NumberOfEmployees, NumberOfMachines, etc. are the same but StartPeriod and EndPeriod differ? Of course, I would like to replace the start period with the earliest one and the end period with the latest one.
The r is just a row indicator, not a column in the database. In this case, rows 1 and 2 will be merged, but rows 3 and 4 will be kept as they are, since their values differ (row 3 has 31 employees and row 4 has 32, instead of 30).
In my opinion this is such a hard problem that I am not even sure whether SQL can do it.
The resulting output would be:
r|Company - StartPeriod - EndPeriod - NumberOfEmployees - NumberOfMachines ...
1|A - 2014-1-1 - 2014-3-7 - 30 - 1
2|A - 2014-3-8 - 2014-4-9 - 31 - 1
3|A - 2014-4-9 - 2014-5-1 - 32 - 1
4|B - 2014-1-2 - 2014-4-1 - 20 - 2
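To pin down the rule I have in mind, here is a minimal procedural sketch in Python (not the SQL I am after). It assumes that only consecutive rows of the same company, ordered by StartPeriod, should be merged, so a value that reappears later would start a new row. The function name merge_periods and the tuple layout are just made up for this illustration.

```python
from datetime import date
from itertools import groupby

# Sample rows from the table above:
# (Company, StartPeriod, EndPeriod, NumberOfEmployees, NumberOfMachines)
rows = [
    ("A", date(2014, 1, 1), date(2014, 2, 5), 30, 1),
    ("A", date(2014, 2, 6), date(2014, 3, 7), 30, 1),
    ("A", date(2014, 3, 8), date(2014, 4, 9), 31, 1),
    ("A", date(2014, 4, 9), date(2014, 5, 1), 32, 1),
    ("B", date(2014, 1, 2), date(2014, 2, 1), 20, 2),
    ("B", date(2014, 2, 2), date(2014, 3, 1), 20, 2),
    ("B", date(2014, 3, 2), date(2014, 4, 1), 20, 2),
]


def merge_periods(rows):
    """Collapse consecutive rows of a company whose non-period columns match."""
    # Order by company, then by start of period.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    merged = []
    # groupby only groups adjacent items, so each run of identical
    # (company, employees, machines, ...) values becomes one output row.
    for key, run in groupby(ordered, key=lambda r: (r[0],) + r[3:]):
        run = list(run)
        start = min(r[1] for r in run)  # earliest StartPeriod in the run
        end = max(r[2] for r in run)    # latest EndPeriod in the run
        merged.append((key[0], start, end) + key[1:])
    return merged


result = merge_periods(rows)
for row in result:
    print(row)
```

On the sample data this yields the four rows listed above. What I am looking for is an efficient way to express the same thing in SQL.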
Thanks for everything!