
I received a CSV file from an FTP server and am ingesting it into a table. During ingestion I get the error "File was a truncated file".

The actual cause is that the lines in the file end with $ or ^M$, e.g.:

ACT_RUN_TM, PROG_RUN_TM, US_HE_DT^M$
"CONFIRMED","","3600"$

How can I remove the $ and ^M$ from the ends of the lines using a Linux command?

Charles
Arun Padule

2 Answers


The ultimately correct solution is to transfer the file from the FTP server in text mode rather than binary mode, which does the appropriate end-of-line conversion for you. Change your download scripts or FTP application configuration to enable text transfers to fix this in future.

Assuming this is a one-shot transfer and you have already downloaded the file and just want to fix it, you can use tr(1) to translate characters. To remove all control-M characters from a file, pipe it through tr -d '\r'. If you instead want to replace them with control-J (for example, if the file came from a pre-OS X Mac system, which ended lines with a bare carriage return), use tr '\r' '\n'.
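As a minimal sketch of the tr approach, the following builds a small sample file and strips its carriage returns. The file names and the sample data are assumptions for illustration only; substitute the real download.

```shell
# Hypothetical working files; replace with the real paths.
tr_dir=$(mktemp -d)
bad="$tr_dir/badfile.csv"
good="$tr_dir/goodfile.csv"

# Sample data: the first line ends in CR LF, the second in LF only.
printf 'ACT_RUN_TM, PROG_RUN_TM, US_HE_DT\r\n"CONFIRMED","","3600"\n' > "$bad"

# Delete every carriage return; line feeds are left untouched.
tr -d '\r' < "$bad" > "$good"

# Count the carriage returns in each file by deleting everything else.
bad_cr=$(tr -cd '\r' < "$bad" | wc -c)
good_cr=$(tr -cd '\r' < "$good" | wc -c)
echo "carriage returns before: $bad_cr, after: $good_cr"
```

Because tr only reads standard input, the result has to go to a second file; redirecting back onto the input file would truncate it before it is read.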

pndc

It's odd to see ^M as not-the-last character, but:

sed -e 's/^M*\$$//g' <badfile >goodfile

Or use "sed -i" to update the file in place. (Note that "^M" is entered on the command line by pressing Ctrl-V followed by Ctrl-M.)

Update: It has been established that the question was mistaken: the "^M$" sequences are not literally in the file but are how vi displays the line endings. The asker actually wants to change CRLF pairs to just LF.

sed -e 's/^M$//g' <badfile >goodfile
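For an automated script, typing a literal ^M is awkward. One possible workaround, sketched here with made-up sample data and hypothetical file names, is to hold the carriage return in a shell variable and splice it into the sed expression:

```shell
# Hypothetical working files; replace with the real paths.
sed_dir=$(mktemp -d)
printf '"a","b"\r\n"d","e"\n' > "$sed_dir/badfile"

# Keep a literal carriage return in a variable, so no Ctrl-V Ctrl-M is needed.
cr=$(printf '\r')

# Remove one trailing CR per line; lines without one pass through unchanged.
sed -e "s/${cr}\$//" < "$sed_dir/badfile" > "$sed_dir/goodfile"

# Show the result with control characters made visible.
cat -v "$sed_dir/goodfile"
```

With GNU sed the same substitution can also be written as 's/\r$//', but other sed implementations may not understand the \r escape, which is why the sketch avoids it.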

Brian White
  • The sed command is not able to delete the ^M$ and $ from the end of the line. (Note: ^M$ and $ are only visible when I run :set list in the vi editor.) – Arun Padule Oct 23 '12 at 13:53
  • @ArunPadule, it worked just fine for me on my command-line. Note the "*" between the "^M" and "\$" to match it optionally. It doesn't have to be visible to work. Pipe the output of `sed` into `cat -v` to see special characters. – Brian White Oct 23 '12 at 13:58
  • OH! Those characters are **not** in your file. They're just being displayed. Let me update the answer... – Brian White Oct 23 '12 at 14:00
  • The sed command is not able to delete the ^M$ and $ from the end of the line. Below is the sample data: "a","b"^M$ "d","e"$ (Note: ^M$ and $ are only visible when I run :set list in the vi editor.) – Arun Padule Oct 23 '12 at 14:05
  • We also tried replacing the carriage return with a line break using the command s/\r\r\n/
    /g, but that did not work either.
    – Arun Padule Oct 23 '12 at 14:13
  • The issue is resolved when we open the CSV file on Windows, save it as it is, and then copy it to the Linux server: the ^M$ and $ disappear from the ends of the lines, even in :set list mode, and I am able to ingest the file. But how can I do this on the Linux server itself? I need to automate the process rather than do it manually. – Arun Padule Oct 23 '12 at 14:17
  • Forget vi. Use `cat -v` so you know exactly what is in the file versus what some editor is displaying. There shouldn't be two `\r` together, just one, and the `\n` is managed by sed as the line separator, so it isn't available to match. Use `s/\r//g`. – Brian White Oct 23 '12 at 15:41
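
To see what the comments are describing, here is a small sketch that inspects a sample file outside of vi. The sample data mirrors the asker's example and the file name is hypothetical. In this display, ^M stands for a carriage return byte that really is in the file, while $ only marks where each line ends.

```shell
# Hypothetical sample file reproducing the asker's data.
sample=$(mktemp)
printf '"a","b"\r\n"d","e"\n' > "$sample"

# cat -e shows control characters and marks each line end with $,
# much like :set list in vi.
cat -e "$sample"
# prints:
# "a","b"^M$
# "d","e"$

# cat -v shows control characters only, so the $ markers disappear.
cat -v "$sample"
```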