0

I have a table that I need to do an initial update on and then a nightly update on. I have a file name that I need to get values from inorder to update two columns. I come from an Oracle PL/SQL background, and have little experience in SQL Server and T-SQL. I am good with Oracle cursors to traverse tables and understand they are controversial for doing stuff in SQL Server.

My ultimate goal is to provide the users with information on how many of each file type and Water Right type have been processed.

File names look like this:

UPDATE_TO_FILE_STATEMENT_OF_CLAIM_41L_44799_00_2780.pdf
PRELIMINARY_DECREE_STATEMENT_OF_CLAIM_41G_40643_00_1134.pdf
FILE_STOCKWATER_PERMIT_40N_30116114_10437.pdf
UPDATE_TO_FILE_GROUND_WATER_CERTIFICATE_40E_30025354_10399.pdf

From this file name, I need to pull the file type and the Water Right type. There are 4 file types and 20 Water Right types. I could use a Case, but then I have to change the code if they add a file type or a Water Right type. Parsing the names is no easy task either. I have created a table with the codes I will use, what the file name looks like, the description, and whether or not it is a File Type or Water Right Type.

FILE_CD FILE_NAME                       DESCR                       FILE_TYPE
----------------------------------------------------------------------------
62GW    62-73_GROUND_WATER_RECORD       62-73 GROUND WATER RECORD       WT
CDWR    CONSERVATION_DISTRICT_RECORD    CONSERVATION DISTRICT RECORD    WT
UPDT    UPDATE_TO_FILE_                 UPDATE TO FILE                  FT

My plan is to traverse this table and run updates for the initial run. I will also traverse this table and do a single update in a trigger for each record added each night.

This is my code:

BEGIN TRANSACTION

DECLARE @F_type NVARCHAR(2)
DECLARE @F_name NVARCHAR(30)
DECLARE @F_CD NVARCHAR(4)

-- local means the cursor name is private to this code
-- fast_forward enables some speed optimizations
DECLARE c_TUpdt CURSOR LOCAL FAST_FORWARD FOR
   SELECT 
       FILE_CD, FILE_NAME, FILE_TYPE
   FROM 
       PRDECMSTATS.dbo.FileType 
   WHERE 
       FILE_CD NOT IN ('OTHT', 'OTHF')

OPEN c_TUpdt

FETCH NEXT FROM c_TUpdt INTO @F_CD, @F_name, @F_type

WHILE @@fetch_status = 0
BEGIN
    -- CHECK FILE TYPE
    IF @F_type = 'FT' THEN
       UPDATE PRDECMSTATS.dbo.FileDetails 
       SET File_Type = @F_CD
       WHERE File_Type IS NULL
         AND FileNAME LIKE @F_name || '%'
    END IF

    IF @F_type = 'WT' THEN
       UPDATE PRDECMSTATS.dbo.FileDetails 
       SET WR_Type = @F_CD
       WHERE WR_Type IS NULL
         AND FileNAME LIKE '%' || @F_name || '%'
    END IF

    FETCH NEXT FROM c_TUpdt INTO @F_CD, @F_name, @F_type
END

CLOSE c_TUpdt
DEALLOCATE c_TUpdt

/* need to then traverse for other types, these will need to be corrected by hand at some point */
UPDATE PRDECMSTATS.dbo.FileDetails 
SET File_Type = 'OTHF'
WHERE File_Type IS NULL

UPDATE PRDECMSTATS.dbo.FileDetails 
SET WR_Type = 'OTHT'
WHERE WR_Type IS NULL

COMMIT TRANSACTION

I expect this to update the 2 columns given the data in the file name.

WILL THIS WORK? Is there a better way to do this?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
SnowXTC
  • 19
  • 5

2 Answers2

2

Cursors can work, but are almost never the most efficient answer. I believe this is a complete replacement for your cursor

UPDATE fd
SET File_Type = ft.FILE_CD
FROM PRDECMSTATS.dbo.FileDetails fd
INNER JOIN PRDECMSTATS.dbo.FileType ft ON ft.FILE_CD NOT IN ('OTHT', 'OTHF')
    AND ft.FILE_TYPE='FT'
    AND fd.File_Type IS NULL
    AND fd.FileNAME LIKE ft.FILE_NAME + '%'

UPDATE fd
SET WR_Type = ft.FILE_CD
FROM PRDECMSTATS.dbo.FileDetails fd
INNER JOIN PRDECMSTATS.dbo.FileType ft ON ft.FILE_CD NOT IN ('OTHT', 'OTHF')
    AND ft.FILE_TYPE='WT'
    AND fd.WR_Type IS NULL
    AND fd.FileNAME LIKE  '%' + ft.FILE_NAME + '%'

It could be condensed into a single statement, but I think it would make it far less legible. By the way, the '+' operator concatenates in SQL Server. Also, the first UPDATE doesn't have a wildcard at the front of the string like the second one; not sure if that is intentional.

Expanding beyond the cursor in question, this could replace everything

UPDATE fd
SET File_Type = ISNULL(ft.FILE_CD,'OTHF')
FROM PRDECMSTATS.dbo.FileDetails fd
LEFT JOIN PRDECMSTATS.dbo.FileType ft ON ft.FILE_CD NOT IN ('OTHT', 'OTHF')
    AND ft.FILE_TYPE='FT'
    AND fd.FileNAME LIKE ft.FILE_NAME + '%'
WHERE fd.File_Type IS NULL

UPDATE fd
SET WR_Type = ISNULL(ft.FILE_CD,'OTHT')
FROM PRDECMSTATS.dbo.FileDetails fd
LEFT JOIN PRDECMSTATS.dbo.FileType ft ON ft.FILE_CD NOT IN ('OTHT', 'OTHF')
    AND ft.FILE_TYPE='WT'
    AND fd.FileNAME LIKE  '%' + ft.FILE_NAME + '%'
WHERE fd.WR_Type IS NULL
UnhandledExcepSean
  • 12,504
  • 2
  • 35
  • 51
  • The wildcard was intentional as the filename will always start with the File Type, thus no need for the wildcard. This is really awesome. While cursors work great in Oracle, they are kind of cumbersome in MS SQL. It is great to see other options and ways of doing things. Thank you. – SnowXTC Jan 04 '19 at 18:58
  • Interesting, the first option set all the file types to OTHF and the second option did nothing. – SnowXTC Jan 11 '19 at 21:50
  • Ok, this needed RTRIM on the file name too. I like the single pass answer, but it will only work for the WR_type and not the File_Type. The single pass is also noticeably faster. – SnowXTC Jan 11 '19 at 23:06
1

Sure, you could use a cursor for this, but you can do all of this with a single pass of the data (much, much, much more efficient). Compare taking one trip to the grocery store to get everything vs. going to the store to get milk, driving home, returning to the store to get bread, driving home, and so on.

;WITH ft AS
(
  SELECT FILE_CD, FILE_TYPE, 
    Pattern = CASE FILE_TYPE WHEN 'WT' THEN '%' ELSE '' END + FILE_NAME + '%'
  FROM PRDECMSTATS.dbo.FileType
  WHERE FILE_CD NOT IN ('OTHT','OTHF')
)
UPDATE f SET 
  FILE_TYPE = CASE WHEN ft.FILE_TYPE = 'FT' THEN ft.FILE_CD ELSE 'OTHF' END,
  WR_TYPE = CASE WHEN ft.FILE_TYPE = 'WT' THEN ft.FILE_CD ELSE 'OTHT' END
FROM PRDECMSTATS.dbo.FileDetails AS f
LEFT OUTER JOIN ft -- changed this
ON f.FileNAME LIKE ft.pattern;

Now, your question has this logic:

UPDATE PRDECMSTATS.dbo.FileDetails 
SET File_Type = 'OTHF'
WHERE File_Type IS NULL

UPDATE PRDECMSTATS.dbo.FileDetails 
SET WR_Type = 'OTHT'
WHERE WR_Type IS NULL

You complained that I am setting File_Type to OTHF, but that's exactly what you were already doing! Here is the result I get:

enter image description here

With the sample data you provided, and the logic you wrote in your pseudo-cursor, I don't know how you could ever get:

  • any value other than OTHT in WR_Type (since no file you provided in the question match the pattern for WT)
  • OTHT in File_Type at all (neither your cursor nor my code can do that)
  • OTHF in WR_Type at all (neither your cursor nor my code can do that either)

Maybe start over with just a full set of sample data (e.g. also include at least one file that matches a WT pattern), and desired results, and don't show us the cursor in the middle, because I think you have logic errors in there that are driving assumptions in the reader and are doubling back to confuse you.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • This affected 0 rows. I am going to have to look at this some more. – SnowXTC Jan 11 '19 at 21:51
  • I definitely need to RTRIM the file name. This gave me the WR_TYPE perfectly, but for some reason the FILE_TYPE is still returning OTHF for every record. – SnowXTC Jan 11 '19 at 22:31
  • Maybe the data in your table is not similar to the data in your question. – Aaron Bertrand Jan 12 '19 at 00:06
  • ...or you didn't copy the code correctly. It's quite possible you put `OTHF` in *both* CASE expressions. – Aaron Bertrand Jan 12 '19 at 00:39
  • I copied the code exactly and modified only the table name and added the RTRIM. If it updates File Type, it sets the WR Type to OTHT, if it updates WR Type it sets the File Type to OTHF. It will not properly update both. – SnowXTC Jan 18 '19 at 21:48
  • Here is the code. Which won't format. ;WITH ft AS ( SELECT FILE_CD, FILE_TYPE, Pattern = CASE FILE_TYPE WHEN 'WT' THEN '%' ELSE '' END + RTRIM(FILE_NAME) + '%' FROM PRDECMSTATS.dbo.FileTypes WHERE FILE_CD NOT IN ('OTHT','OTHF') ) UPDATE f SET FILE_TYPE = CASE WHEN ft.FILE_TYPE = 'FT' THEN ft.FILE_CD ELSE 'OTHF' END, WR_TYPE = CASE WHEN ft.FILE_TYPE = 'WT' THEN ft.FILE_CD ELSE 'OTHT' END FROM PRDECMSTATS.dbo.FileDetails AS f INNER JOIN ft ON f.FileNAME LIKE ft.pattern; – SnowXTC Jan 18 '19 at 21:51
  • I totally understand why it does it, but my lack of knowledge in MS SQL keeps me from knowing a solution to fix it.
    If File Type is FT, it will enter the proper file type, but because the file type is FT, WR Type is set to OTHF. It all depends on whether or not it hits the FT or the WT first and it will not go back and correct the other field once the record is updated.
    – SnowXTC Jan 18 '19 at 22:16
  • Look, your sample data only has two files that would be affected in the same way, and no files that would be affected the other way, so it's really hard to determine what is actually required here. Also, your question says `File_Type='OTHT'/WR_Type='OTHF'` and now you say you want the opposite. Yes, you'll need `RTRIM(FILE_NAME)` if the data type in `FileType` is `char` or it has trailing spaces. You should handle that at the table or when the data is populated, not in every single query you run against it later. Anyway, try a LEFT JOIN instead of INNER. You can sort out OTHT/OTHF on your own. – Aaron Bertrand Jan 18 '19 at 23:14