
I'm trying to edit an existing process that is called from SQL Server 2008 R2 via xp_cmdshell. It replaces or removes certain characters in a CSV file before the file is imported. I've managed to add the removal of full stops, but I'm having trouble getting it to remove backticks. However I try to escape the backtick, I get the error: sed: -e expression #1, char 97: Unterminated `s' command

In SQL:

DECLARE 
    @FileLocation VARCHAR(255)          = '\\FILELOCATION\'
    ,@ArchiveLocation VARCHAR(255)      = '\\FILELOCATION\Archive\'
    ,@FileName VARCHAR(50)              = 'DATE_FILENAME.csv'
    ,@UtilitiesLocation VARCHAR(100)    = '\\NETWORKLOCATION\UTILITIES\'
    ,@Command           VARCHAR(2000)

DECLARE @SedCommand     VARCHAR(100)    = 
      's/£//g;'     -- remove pound signs
    + 's/\.//g;'    -- remove full stop
    + 's/\`//g;'    -- remove grave accent (backtick)
    + 's/\"\([A-Za-z- '']*\),\([A-Za-z- '']*\)\"/\1|\2/g;'  -- "Last, First" replaced by Last|First
    + 's/\"\([0-9.-]*\),\([0-9.-]*\)\"/\1\2/g;'             -- remove thousands separators in numbers

SET @Command = 
      'copy '
    + @FileLocation + @FileName + ' '
    + @ArchiveLocation + 'QN' + @FileName
PRINT @Command -- copy \\FILELOCATION\DATE_FILENAME.csv \\FILELOCATION\Archive\QNDATE_FILENAME.csv

EXEC xp_cmdshell @Command

SET @Command = 
      @UtilitiesLocation + 'sed --text -e ' 
    + '"' + @SedCommand + '" '
    + @ArchiveLocation + 'QN' + @FileName
    + ' > '
    + @FileLocation + @FileName
PRINT @Command -- \\NETWORKLOCATION\UTILITIES\sed --text -e "s/£//g;s/\.//g;s/\`//g;s/\"\([A-Za-z- ']*\),\([A-Za-z- ']*\)\"/\1|\2/g;s/\"\([0-9.-]*\),\([0-9.-]*\)" \\FILELOCATION\Archive\QNDATE_FILENAME.csv > \\FILELOCATION\DATE_FILENAME.csv

EXEC xp_cmdshell @Command

Can anyone help, please? I've tried using three backslashes, using ^ as the escape character, and putting the backtick in double quotes, but I can't get anything to work.
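One thing worth ruling out before more escaping experiments: the PRINT output above stops at `\([0-9.-]*\)"`, without the closing `\"/\1\2/g;` of the last substitution, which suggests the expression no longer fits in `@SedCommand VARCHAR(100)`. By a character count of the five pieces the full expression comes to about 110 characters, so adding the 8-character backtick rule may simply have pushed the cut-off into the middle of the last `s` command. The check below is a sketch under that assumption; `@FullSedCommand` and `@Truncated` are names introduced here for illustration only.

```sql
-- Build the same expression in a wider variable so nothing is cut off.
DECLARE @FullSedCommand VARCHAR(500) =
      's/£//g;'     -- remove pound signs
    + 's/\.//g;'    -- remove full stop
    + 's/\`//g;'    -- remove grave accent (backtick)
    + 's/\"\([A-Za-z- '']*\),\([A-Za-z- '']*\)\"/\1|\2/g;'  -- "Last, First" replaced by Last|First
    + 's/\"\([0-9.-]*\),\([0-9.-]*\)\"/\1\2/g;'             -- remove thousands separators in numbers

-- Force the same value into the original size; assigning to a
-- VARCHAR(100) variable truncates silently rather than raising an error.
DECLARE @Truncated VARCHAR(100) = @FullSedCommand

-- Compare the lengths and look at where the stored value ends.
SELECT
      LEN(@FullSedCommand)   AS FullLength
    , LEN(@Truncated)        AS StoredLength
    , RIGHT(@Truncated, 12)  AS StoredTail
```

If `FullLength` comes back larger than `StoredLength`, widening `@SedCommand` (for example to `VARCHAR(500)`) would be the first thing to try. That would also fit the reported position: once the three `\"` pairs are reduced to `"` during command-line parsing, a 100-character expression is 97 characters long by the time sed sees it.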

PChopra
  • Does a simpler `sed` command work (without backticks)? Also, in my experience you may need as many as 5 escape characters, and I can imagine cases where it would be more. Set up a test just for that (without all the other stuff) to understand the issue. And if you can't run `sed` on the file before you load it, can't you make a work table and clean up your data without an external call? Or make a permanent external file for your sed commands and then just call `sed -f script.sed -i data.txt`? Good luck. – shellter Oct 18 '16 at 12:20
  • @shellter The sed command works without the backtick part, that's the part I'm trying to add. I get the same error if I remove everything else leaving just the backtick removal. I need to run it before the import as this is done via bcp and the column mappings aren't working and the data is ending up in the wrong columns when a backtick is encountered. 5 backslashes resulted in the same error. – PChopra Oct 18 '16 at 12:39
  • I still think it would be best to process the file outside of SQL. But if you can create a pipeline and call it, then `tr -d 'X' < file | sed '...' > outfile` could help you (where X is the backtick, of course). That is to say, use the `tr` command to delete that character (and you could delete any others that don't require a regex for context). Else, keep adding backslashes? Or create a load table that is just one column for the whole record, do your substitutions there, and rely on clean field separators to parse into the production tables? Good luck. – shellter Oct 18 '16 at 13:08
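The script-file suggestion in the comments above could look roughly like this. It is a sketch only: `clean.sed` is a hypothetical file name, placing it in the utilities share is an assumption, and it would probably need to be saved in the same encoding as the CSV for the £ rule to match. Because the expressions no longer travel through the command line, the double quotes need no backslash escaping and no T-SQL variable length applies to them. The copy to the archive folder is assumed to have run already, as in the original process.

```sql
-- Hypothetical contents of \\NETWORKLOCATION\UTILITIES\clean.sed (one command per line):
--   s/£//g
--   s/\.//g
--   s/`//g
--   s/"\([A-Za-z- ']*\),\([A-Za-z- ']*\)"/\1|\2/g
--   s/"\([0-9.-]*\),\([0-9.-]*\)"/\1\2/g

DECLARE
    @FileLocation VARCHAR(255)          = '\\FILELOCATION\'
    ,@ArchiveLocation VARCHAR(255)      = '\\FILELOCATION\Archive\'
    ,@FileName VARCHAR(50)              = 'DATE_FILENAME.csv'
    ,@UtilitiesLocation VARCHAR(100)    = '\\NETWORKLOCATION\UTILITIES\'
    ,@Command           VARCHAR(2000)

-- Same call as before, but sed reads its commands from the file via -f
-- instead of receiving them inline via -e.
SET @Command =
      @UtilitiesLocation + 'sed --text -f '
    + @UtilitiesLocation + 'clean.sed '     -- hypothetical script file
    + @ArchiveLocation + 'QN' + @FileName
    + ' > '
    + @FileLocation + @FileName
PRINT @Command

EXEC xp_cmdshell @Command
```

The output is still redirected to the original file name rather than edited in place with `-i`, to stay close to the existing process.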
