0

I have two queries that work perfectly:

DECLARE @StartDate DATETIME = '2021-11-01 00:00:00';
DECLARE @EndDate DATETIME = '2022-03-16 23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear,
    COUNT(TransactionId) AS TransactionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND MasterRecord = 1 
    AND TransactionTypeId = @SalesEstimateTransactionTypeId
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),  
    DATENAME(mm,GeneralJournal.[TransactionDate]);

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear, 
    COUNT(DISTINCT TransactionId) AS ConversionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND MasterRecord = 0 
    AND TransactionTypeId = @SalesOrderTransactionTypeId 
    AND SEReferenceId > 0
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),  
    DATENAME(mm,GeneralJournal.[TransactionDate]);

Note that the second query returns distinct because it can return multiple values and we only want to count each TransactionId once in that scenario. These return the following results:

ReportingMonth MonthNumber ReportingYear TransactionCount
November 11 2021 82
December 12 2021 49
January 1 2022 64
February 2 2022 67
March 3 2022 49
ReportingMonth MonthNumber ReportingYear ConversionCount
November 11 2021 42
December 12 2021 27
January 1 2022 31
February 2 2022 50
March 3 2022 24

I actually need to combine them like this:

ReportingMonth MonthNumber ReportingYear TransactionCount ConversionCount
November 11 2021 82 42
December 12 2021 49 27
January 1 2022 64 31
February 2 2022 67 50
March 3 2022 49 24

I have tried pretty much everything I can think of - Unions, Joins, Subqueries - but so far nothing is quite right. This is the closest I can get:

SELECT 
    DATENAME(mm, GeneralJournal.[TransactionDate]) AS ReportingMonth, 
    DATEPART(mm, GeneralJournal.[TransactionDate]) AS MonthNumber, 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]) AS ReportingYear, 
    SUM(CASE 
            WHEN TransactionTypeId = @SalesEstimateTransactionTypeId 
                 AND MasterRecord = 1 
               THEN 1 ELSE 0 
        END) AS TransactionCount, 
    COUNT(CASE 
              WHEN TransactionTypeId = @SalesOrderTransactionTypeId  
                   AND SEReferenceId > 0 THEN 1 
          END) AS ConversionCount
FROM 
    GeneralJournal 
WHERE 
    GeneralJournal.[TransactionDate] >= @StartDate 
    AND GeneralJournal.[TransactionDate] <= @EndDate 
    AND TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
GROUP BY 
    DATEPART(yyyy, GeneralJournal.[TransactionDate]), 
    DATEPART(mm, GeneralJournal.[TransactionDate]),    
    DATENAME(mm,GeneralJournal.[TransactionDate]);

However, I am unable to find a way to get a Distinct value for the ConversionCount. As a result it is returning the full count:

ReportingMonth MonthNumber ReportingYear TransactionCount ConversionCount
November 11 2021 82 152
December 12 2021 49 67
January 1 2022 64 101
February 2 2022 67 136
March 3 2022 49 64

Can anyone guide me towards a way to combine the two query results whilst maintaining the Distinct on the conversion count? I must add that for it to work the answer must be compatible with both SQL Server and VistaDB the syntax of which is a subset of T-SQL because I am obliged to support both database engines with the same query.

EDIT - The Final Solution

Following on from Nick's excellent answer I was able embed the solution into my existing query code to ensure that there are results even for months with no records, shown here in case it helps anyone else:

DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-10-31T23:59:59';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

DECLARE @CurrentDate DATETIME;
DECLARE @Months TABLE(ReportingYear INT, MonthNumber INT, ReportingMonth VARCHAR (40));

-- Set the initial date
SET @CurrentDate = @StartDate
-- insert all dates into temp table
WHILE @CurrentDate <=  @EndDate
BEGIN
    INSERT INTO @Months VALUES(DATEPART(year, @CurrentDate), DATEPART(month, @CurrentDate), DATENAME(mm, @CurrentDate))
    SET @CurrentDate = dateadd(mm, 1, @CurrentDate)
END;

SELECT ReportingMonth, ReportingYear, Coalesce(TransactionCount, 0) AS TransactionCount, Coalesce(ConversionCount,0) AS ConversionCount
FROM
(
    SELECT months.[ReportingMonth], months.[ReportingYear], conversionData.[TransactionCount], conversionData.[ConversionCount]
    FROM @Months months
    LEFT JOIN
    (
        SELECT
        ReportingMonth      = DATENAME(mm, GeneralJournal.[TransactionDate]),
        MonthNumber         = DATEPART(mm, GeneralJournal.[TransactionDate]),
        ReportingYear       = DATEPART(yyyy, GeneralJournal.[TransactionDate]),
        TransactionCount    = SUM(CASE WHEN TransactionTypeId = @SalesEstimateTransactionTypeId AND GeneralJournal.[MasterRecord] = 1 THEN
                                        1
                                    ELSE
                                        0
                                END
                            ),
        ConversionCount     = COUNT(DISTINCT CASE WHEN GeneralJournal.[TransactionTypeId] = @SalesOrderTransactionTypeId
                                        AND GeneralJournal.[SEReferenceId] > 0
                                        AND GeneralJournal.[MasterRecord] = 0 THEN
                                        GeneralJournal.[TransactionID]
                                END
                            )
        FROM GeneralJournal
        WHERE GeneralJournal.[TransactionDate] >= @StartDate
            AND GeneralJournal.[TransactionDate] <= @EndDate
            AND GeneralJournal.[TransactionTypeId] IN ( @SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
        GROUP BY
            DATEPART(yyyy, GeneralJournal.[TransactionDate]),
            DATEPART(mm, GeneralJournal.[TransactionDate]),
            DATENAME(mm, GeneralJournal.[TransactionDate])
    ) as conversionData
    ON months.[ReportingYear] = conversionData.[ReportingYear] AND months.[MonthNumber] = conversionData.[MonthNumber]
) AS data;
oldcoder
  • 342
  • 2
  • 12
  • Aside... _I have two queries that work perfectly_ ... in very specific circumstances. Trying to use `'2021-11-01 00:00:00'` for a datetime will yield 01-Nov-2021 when `set dateformat mdy` is in effect, but 11-Jan-2021 when `set dateformat dmy` is in effect. If you want to be dateformat-agnostic you'll need to use the `T` separator, e.g.: `'2021-11-01T00:00:00'`. – AlwaysLearning Mar 18 '22 at 13:47
  • Don't use inclusive upper boundary for datetime ranges - refer to [dating responsibly](https://sqlblog.org/dates) – SMor Mar 18 '22 at 13:58
  • Can you provide some example detail data for us to work with? – Nick Fotopoulos Mar 18 '22 at 16:15
  • Hi @AlwaysLearning, Thanks for the tips. The circumstances are a little difficult, since the dates are passed from the application in the formats listed based on accounting date range data and parameterised for the real query. It passes ToStartOfDay and ToEndOfDay on the start and end date, and that granularity is too hard for SQL Server to handle so it has to be tweaked. You are correct though in everything you say and I really appreciate the pointers. – oldcoder Mar 19 '22 at 14:46
  • HI @SMor, Thanks for the advice. I am working with supplied values that have an inclusive boundary for human review and are passed as parameters to the value. I will look again to see if that can be changed but it would have to be done in the calling code e.g. add one milisecond to EndOfDay to get to StartOfDay for the next day to to as you recommend. – oldcoder Mar 19 '22 at 14:52

2 Answers2

2

You can just put the two columns in the same query. It is made more complicated by the fact that the WHERE clauses are slightly different. SO you need to group, then group again, and use conditional aggregation to count the right rows for each column.

Note the following:

  • You could in theory do COUNT(DISTINCT CASE however that is normally slower as the compiler will not recognize what the CASE is doing and instead do a full sort.
  • It is faster to group by a single EOMONTH calculation to group by a whole month. You can pull out the year and month in the SELECT.
  • COUNT(TransactionId) will return the number of non-null TransactionId values. if TransactionId cannot be null then COUNT(*) is the same thing.
  • If TransactionDate has a time component then you should use a half-open interval >= AND <
  • Use aliases on tables, it makes your queries more readable.
  • Use whitepsace, it's free.
DECLARE @StartDate DATETIME = '2021-11-01T00:00:00';
DECLARE @EndDate DATETIME = '2022-03-17T00:00:00';
DECLARE @SalesEstimateTransactionTypeId INT = 16;
DECLARE @SalesOrderTransactionTypeId INT = 15;

SELECT
  DATENAME(month, gj.mth) AS ReportingMonth,
  DATEPART(month, gj.mth) AS MonthNumber,
  DATEPART(year , gj.mth) AS ReportingYear,
  SUM(TransactionCount) AS TransactionCount,
  COUNT(CASE WHEN ConversionCount > 0 THEN 1 END) AS ConversionCount
FROM (
    SELECT
      EOMONTH(gj.TransactionDate) AS mth,
      gj.TransactionId,
      COUNT(CASE WHEN gj.MasterRecord = 1 AND gj.TransactionTypeId = @SalesEstimateTransactionTypeId THEN 1 END) AS TransactionCount,
      COUNT(CASE WHEN gj.MasterRecord = 0 AND gj.TransactionTypeId = @SalesOrderTransactionTypeId AND gj.SEReferenceId > 0 THEN 1 END) AS ConversionCount
    FROM GeneralJournal gj
    WHERE gj.TransactionDate >= @StartDate
      AND gj.TransactionDate <  @EndDate
      AND gj.TransactionTypeId IN (@SalesOrderTransactionTypeId, @SalesEstimateTransactionTypeId)
    GROUP BY
      EOMONTH(gj.TransactionDate),
      TransactionId
) g
GROUP BY
  mth;
Charlieface
  • 52,284
  • 6
  • 19
  • 43
1

Your second query that is close, I think just has a couple of minor omissions.

  1. You forgot MasterRecord = 0 in your ConversionCount CASE statement.
  2. Instead of returning 1 or 0 from your ConversionCount CASE you should return TransactionID or NULL so you can still count distinct values.
  3. You are missing DISTINCT inside of your ConversionCount COUNT.
  4. You will need to handle NULL values in the ConversionCount COUNT. I assumed you will always have at one or more NULLs, so I just subtract 1 from the COUNT(DISTINCT ...) to compensate.

(I can't be 100% on the syntax here without some example detail data to work with.)

Code

SELECT
    ReportingMonth      = DATENAME(mm, GeneralJournal.TransactionDate),
    MonthNumber         = DATEPART(mm, GeneralJournal.TransactionDate),
    ReportingYear       = DATEPART(yyyy, GeneralJournal.TransactionDate),
    TransactionCount    = SUM(CASE
                                WHEN TransactionTypeId = @SalesEstimateTransactionTypeId
                                    AND MasterRecord = 1 THEN
                                    1
                                ELSE
                                    0
                            END
                        ),
    ConversionCount     = COUNT(DISTINCT CASE
                                WHEN TransactionTypeId = @SalesOrderTransactionTypeId
                                    AND SEReferenceId > 0
                                    AND MasterRecord = 0 THEN
                                    TransactionID
                                ELSE
                                    NULL
                            END
                        ) - 1 /* Subtract 1 for the NULL */
FROM    GeneralJournal
WHERE
    GeneralJournal.TransactionDate >= @StartDate
    AND GeneralJournal.TransactionDate <= @EndDate
    AND TransactionTypeId IN (
            @SalesOrderTransactionTypeId,
            @SalesEstimateTransactionTypeId
        )
GROUP BY
    DATEPART(yyyy, GeneralJournal.TransactionDate),
    DATEPART(mm, GeneralJournal.TransactionDate),
    DATENAME(mm, GeneralJournal.TransactionDate);
Nick Fotopoulos
  • 531
  • 5
  • 15
  • 1
    Hi Nick, thank you for this answer. You spotted my errors, and I learned something new as I have never before seen the alias = expression syntax and always used expression AS alias. The only change I needed to make was to remove the null and corresponding deduction. Since we are only counting the ones that match, in every date range a tried the values were returned one less than they should have been. I have put it back into the code I already have that allows for months with no transactions and still returns zero, and all appears perfect. Thank you so much for your great answer. – oldcoder Mar 19 '22 at 15:02
  • I prefer the ColumnName = format because it's keep your column aliases easily readable and lined up when you have complicated calculated column. – Nick Fotopoulos Mar 21 '22 at 12:49