I have two tables with the following structure:

Table - MemoType

ID | MemoTypeID | MemoTypeName
1  | 1234       | A
2  | 5678       | B

Table - Memos

ID | MemoTypeID | Memo      | ExtRef
1  | 1234       | TextOne   | XYZ
2  | 5678       | TextTwo   | XYZ
3  | 1234       | TextThree | TUV

We would like to update these tables to reflect the following data:

Table - MemoType

ID | MemoTypeID | MemoTypeName
3  | 9999       | NewCombinedMemo

Table - Memos

ID | MemoTypeID | Memo                    | ExtRef
4  | 9999       | <A> TextOne <B> TextTwo | XYZ
5  | 9999       | <A> TextThree           | TUV

The memos table has about 2 million rows with about 200,000 unique values for ExtRef.

My thinking is along the following lines (using .NET): populate a list of all unique ExtRef values from the Memos table; for each unique ExtRef, get a list of all Memo values; concatenate the strings as required; insert a new record for each ExtRef; then delete the remaining records for that ExtRef. The problem is that this would result in a large number of SQL operations.

Please suggest other, more efficient strategies to achieve this directly in SQL.

swiftcode
  • Instead of getting a list of unique `ExtRef` values and then looking up the strings for each, do a `GroupBy(x => x.ExtRef)`. – Tim S. Jun 25 '13 at 20:26

1 Answer


This is indeed possible directly through SQL. The following creates table variables to demonstrate / test with the sample data, and it doesn't delete the original data.

The original data could easily be deleted using a clause checking on the memo type id, but I'd want to hold off on that until I'd performed a manual check on such a large table!
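For reference, that cleanup might look like the sketch below, run against the real tables from the question (the table names and type ids are taken from the sample data) and only after the combined rows have been verified:

-- Hypothetical cleanup: removes the superseded rows once the new
-- 9999 rows have been checked.
DELETE FROM Memos
WHERE MemoTypeId IN (1234, 5678);  -- old types; combined rows are untouched

DELETE FROM MemoType
WHERE MemoTypeId IN (1234, 5678);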

-- setting the scene
DECLARE @MemoType TABLE
(
    Id int,
    MemoTypeId int, 
    MemoTypeName varchar(30)
)

DECLARE @Memo TABLE
(
    Id int identity(1,1),
    MemoTypeId int, 
    Memo varchar(500),
    ExtRef varchar(1000)
)

INSERT INTO @MemoType VALUES (1,1234,'A');
INSERT INTO @MemoType VALUES (2,5678,'B');
INSERT INTO @MemoType VALUES (3,9999,'NewCombinedMemo');

INSERT INTO @Memo VALUES (1234, 'TextOne', 'XYZ');
INSERT INTO @Memo VALUES (5678, 'TextTwo', 'XYZ');
INSERT INTO @Memo VALUES (1234, 'TextThree', 'TUV');



WITH cte(id, memotype, memotext, ref) AS (
   SELECT Id, MemoTypeId, Memo, ExtRef FROM @Memo
)
INSERT INTO @Memo
SELECT 9999, STUFF(memos, 1, 1, ''), ref  -- strip the leading comma
FROM cte [outer]
CROSS APPLY (
    -- comma-delimited list of every memo sharing the outer row's reference
    SELECT ',' + memotext
    FROM cte [inner]
    WHERE [outer].ref = [inner].ref
    FOR XML PATH('')
) n(memos)
GROUP BY ref, memos  -- collapse to one combined row per reference

SELECT * FROM @Memo

The CTE logic was borrowed from string concatenate in group by function with other aggregate functions, adding the insert and stripping out the leading comma.

  • I placed the base query in a CTE.
  • Then I cross-applied a subquery that builds a comma-delimited set of memos for each reference in the outer query.
  • Since I also selected the memos column, I had to group by it as well.
  • The initial comma is stripped out with the STUFF function.
  • Finally, the result is inserted.
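
One caveat worth noting: FOR XML PATH('') entitizes XML-special characters (a memo containing <A> would come out as &lt;A&gt;), and the query above doesn't add the <A>/<B> type-name prefixes shown in the desired output. The sketch below, against the same table variables, is one way to handle both: the TYPE directive plus .value() undoes the entitization, and the ORDER BY makes the concatenation order deterministic.

-- Variant of the insert above: prefixes each memo with its type name
-- and survives XML-special characters. Run it instead of that insert.
INSERT INTO @Memo
SELECT 9999, LTRIM(n.memos), m.ExtRef
FROM @Memo m
CROSS APPLY (
    SELECT (
        SELECT ' <' + mt.MemoTypeName + '> ' + m2.Memo
        FROM @Memo m2
        INNER JOIN @MemoType mt ON mt.MemoTypeId = m2.MemoTypeId
        WHERE m2.ExtRef = m.ExtRef
          AND m2.MemoTypeId <> 9999  -- skip any already-combined rows
        ORDER BY m2.Id               -- deterministic concatenation order
        FOR XML PATH(''), TYPE
    ).value('.', 'varchar(max)')
) n(memos)
WHERE m.MemoTypeId <> 9999
GROUP BY m.ExtRef, n.memos

With the sample data this yields '<A> TextOne <B> TextTwo' for XYZ and '<A> TextThree' for TUV, matching the desired rows.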
Justin Crabtree