4

I have an OLEDB (SQL) data flow source (A) that pulls a result set from a stored procedure and throws the results into an OLEDB (Oracle) data flow destination (B).

Is there a way to capture an aggregate value from the dataset into a variable, all within the data flow task? Specifically, I'd want to capture the MAX(<DateValue>) from the entire dataset.

dataflowwithcapture

Otherwise, I'd have to pull the same data twice in a different data flow task, whether I point to A or in its new location, B.

EDIT: I already know how to do this in the Control Flow from an Execute SQL task. I'm asking because I'm curious to know if I can get this done in the Data Flow task since I'm already collecting the data there. Is there a way to grab an aggregate value in the Data Flow?

dev_etter
  • 1,156
  • 13
  • 32
  • 1
    Sure thing! You could add a conditional split in between of have a script that does just what you need. But without data structure, I can't give you that script. – WickedFan May 29 '14 at 22:17
  • 1
    I suggest that you use an execute sql task on the top to get the aggregate. Store the aggregate in a variable. Simple. Btw, why do you say that you are a former software dev ? You still are one even now :) – Erran Morad May 29 '14 at 22:37
  • 1
    @WickedFan How would a conditional split do this? – Martin Smith May 30 '14 at 13:46

2 Answers2

6

One way of doing it would be to add a multicast transform between the source and destination that also feeds into a script component.

Whilst an aggregate transform would also work this method avoids adding a blocking transform

enter image description here

Configure the Script Component as a destination, give it read/write access to the variable and then edit the script to be something like

//Instance level variable
DateTime? maxDate = null;

public override void PostExecute()
{
    base.PostExecute();

    if (maxDate.HasValue)
    {
        this.Variables.MaxDate = maxDate.Value;
    }

    System.Windows.Forms.MessageBox.Show(this.Variables.MaxDate.ToString());
}


public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    if (!Row.createdate_IsNull)
    {
        maxDate = Row.createdate < maxDate ? maxDate : Row.createdate;
    }
}
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • This works great and is exactly what I'm looking for. And thank you for suggesting an alternative to using an Aggregate Transform. – dev_etter May 30 '14 at 19:53
0
  • U keep your current DFT as such in the control Flow (source to destination mapping as such)
  • In the control flow, add an Exceute SQL task, With the same source query with your desired MAX() function applied on it.

enter image description here

Eg: 
--Let the given be Your source query.
SELECT  ColumnA,
        ColumnB,
        ColumnC,
        DateValue
FROM    SourceA

--Your new query to calculate MAX() may be this.
SELECT  MAX(DateValue)
FROM    SourceA

Give the 2nd SQL in the execute SQL task. In the package Add a variable of type int, in package level scope. (eg: name = intMax)

In the Execute SQL task, not the following.
    a.general Tab
        Result Set  = Single Row
        Sql Statement = SELECT  MAX(DateValue) FROM SourceA
    b.result set Tab
        click ADD
        ResultName = 0
        variable Name = variable Name (eg: name = intMax)

enter image description here

Your required result will be available in the variable from here onwards.

Jithin Shaji
  • 5,893
  • 5
  • 25
  • 47
  • 1
    This isn't what I'm asking for. I've done this in the control flow before, many times. I want to know if it's possible to do from the Data Flow. – dev_etter May 30 '14 at 12:56
  • ok, In SSIS, data flows in piplelines, so different values, flow through pipeline at different instances.. Think this may help you. 1. Add a LookUp task after your Source. 2. In Look up task, give the SQL to get the max value. 3. Now we have two outputs for Look Up task, Matching and Non-matching. The matching pipe will give the Max(). – Jithin Shaji May 30 '14 at 15:24