2

I have a flat text file that is headerless (I have the layout of fields), comma delimited, that is also quoted identified. The file unfortunately contains >3000 characters of whitespace immediately following the final field in the record set. This is true for all records.

Example record:

"Stuff","Things","No","","000000000.00","000000000.00"," " <-(>3000 whitespace)

I'm currently just trying to get the data into a disk table in SQL Server via SSMS import file wizard just to start profiling the data to ensure my final table is type/sized accordingly. Here is what the preview pane looks like:

enter image description here

Even if I explicitly ignore the fields it wants to apply for the final field and the white space, the import fails on the final field, claiming it couldn't be found.

I've already requested a cleaner file, however I would like to know what to do in the future as a work around while waiting for a cleaner file.

I am open to an SSIS solution, however an SSMS import solution would also be appreciated for just getting data from a file such as this into a temp table to play with the data.

Thanks

jarlh
  • 42,561
  • 8
  • 45
  • 63
  • Why not pre-process the file to clean it up before using the import wizard? A simple one-line `sed` command will do it: `s/\s+$\\`. – rd_nielsen Oct 24 '18 at 02:16
  • Is this a strictly unix command, or is there a cmd or powershell equivalent? – JackalSnacks Oct 25 '18 at 23:08
  • It is originally a *nix command, but there are versions compiled for Windows, for example by the GnuWin32 project. – rd_nielsen Oct 25 '18 at 23:15
  • I've had success with using a powershell script here: https://stackoverflow.com/questions/3579157/how-to-use-powershell-to-remove-space-from-the-end-of-a-line-in-a-text-file. I thank you rd_nielsen for getting me there! – JackalSnacks Oct 25 '18 at 23:53

1 Answers1

0

Using SSIS, the trailing white space can be removed by adding a Derived Column with the RTRIM function. When you add the Derived Column in the Data Flow, select the "Replace" option for your column and apply this function to it. If you need to remove both leading and trailing white space the TRIM function will handle this.

RTRIM(YourColumnName)
userfl89
  • 4,610
  • 1
  • 9
  • 17
  • Can you please lead me into the right direction on this? I have the Flat file source now in the data flow, this is connected to a derived column transform. The bad column is the last in the array, of which specifically is named "Column 63". I have placed column 63 into the derived column transform and used Replace, and using the RTRIM under the expression RTRIM([cColumn 63]). It still seems to break with: [Flat File Source [273]] Error: The column delimiter for column "Column 63" was not found. – JackalSnacks Oct 25 '18 at 23:06
  • You're using a comma as the delimiter, right? Is this specified in the flat file connection manager? Also, is the text qualifier set to quotes? – userfl89 Oct 26 '18 at 13:21
  • Please refer to my samples in the question. The data is comma delimited and text qualified in quotes. The flat file source is configured properly, hence why the preview pane displays fields correctly (except for the last field). the sample data from the file shows that the white space occurs right after the final field quote (,"000000000.00"," " -now begin white space-). This is why the final field in the preview pane looks like: [" " ---]. The white space, i believe, is causing confusion within SSIS and perceiving the added whitespace at the end as an incomplete field definition – JackalSnacks Oct 26 '18 at 21:19
  • The goal is to remove the whitespace post last field. Currently, the pre-processing of the file with a powershell script proposed by rd_nielsen seems to be the way to go. However, it would be nice for an SSIS task solution to remove the whitespace within the source component. However, I am doubting it's existence. – JackalSnacks Oct 26 '18 at 21:21
  • Perhaps I misunderstood your question. The white space following the last field is being interpreted as an additional column in SSIS? If so, you can open the flat file connection manager and delete the last (invalid) column from there. – userfl89 Oct 29 '18 at 13:27