1

I am new to Azure Data Factory, and I have searched everywhere for a solution that may be implemented for my necessity, but I haven't found any.

My Problem: I have a table in Azure Database with a column containing a mathematical operation, about 50 columns containing the variables for the operation and one last column where I need to update the result of the mathematical operation, like this: Example of the table

What I want to do is to fill up the column "result" with the result of the mathematical operation, contained in the column "Operation", using the other columns values in the expression. This is just an example table, my actual table has about 50 columns of values, so it is not a solution for me to use a "replace" operation.

Bribous
  • 13
  • 2

1 Answers1

1

There are probably a few ways to do this but I would not use Data Factory, unless you need to orchestrate this activity as part of a wider pipeline. As you have some compute handy via Azure SQL Database, I would make best use of that unless you have a specific reason not to do so. T-SQL has dynamic SQL and the EXEC command to help. Use a cursor to run through the distinct list of formulas and execute it dynamically. A simplified example:

DROP TABLE IF EXISTS dbo.formulas;

CREATE TABLE dbo.formulas (
    Id          INT PRIMARY KEY,
    formula     VARCHAR(100) NOT NULL,
    a           INT NOT NULL,
    b           INT NOT NULL,
    c           INT NOT NULL,
    d           INT NOT NULL,
    e           INT NOT NULL,

    --...

    result      INT

);

-- Set up test data
INSERT INTO dbo.formulas ( Id, formula, a, b, c, d, e )
VALUES
    ( 1, '(a+b)/d', 1, 20, 2, 3, 1 ),
    ( 2, '(c+b)*(a+e)', 0, 1, 2, 3, 4 ),
    ( 3, 'a*(d+e+c)', 7, 10, 6, 2, 1 )


SET NOCOUNT ON 

-- Create local fast_forward ( forward-only, read-only ) cursor 
-- Get the distinct formulas for the table
DECLARE formulaCursor CURSOR FAST_FORWARD LOCAL FOR 
SELECT DISTINCT formula
FROM dbo.formulas

-- Cursor variables
DECLARE @sql        NVARCHAR(MAX)
DECLARE @formula    NVARCHAR(100)

OPEN formulaCursor

FETCH NEXT FROM formulaCursor INTO @formula
WHILE @@fetch_status = 0
BEGIN

    SET @sql = 'UPDATE dbo.formulas
SET result = ' + @formula + '
--OUTPUT inserted.id        -- optionally output updated ids
WHERE formula = ''' + @formula + ''';'

    PRINT @sql

    -- Update each result field for the current formula
    EXEC(@sql)

    FETCH NEXT FROM formulaCursor INTO @formula
END

CLOSE formulaCursor
DEALLOCATE formulaCursor
GO

SET NOCOUNT OFF
GO


-- Check the results
SELECT *
FROM dbo.formulas;

Cursors have a bad reputation for performance but i) here I'm using the distinct list of formulas and ii) sometimes it's the only way. I can't think of a nice set-based way of doing this - happy to be corrected. CLR is not available to you. If performance is a major issue for you you may need to think about alternatives; there's an interesting discussion on a similar problem here.

My results:

My results

If your database was an Azure Synapse Analytics dedicated SQL pool then you could look at Azure Synapse Notebooks to achieve the same outcome.

wBob
  • 13,710
  • 3
  • 20
  • 37
  • Thanks @wBob, this part is only a piece of the whole pipeline that I need to orchestrate, since it has only a few rows for each execution I wanted prioritize using Azure Data Factory Activities, but also, if there isn't any other way, I'll use what I have. This is one of the strategies that I was thinking of using, maybe using While instead of Cursors, but as I said, its only a few rows, like 300. – Bribous Mar 16 '21 at 02:24
  • Wrap the cursor in a stored proc to call via the Stored Proc activity from ADF. A While loop is just another loop so no real benefit from changing from one to another. – wBob Mar 16 '21 at 10:42