2

I am trying to remove carriage returns from a .tsv file using a batch file. This is how my .tsv file looks; the first line is the header row.

**Note:** The [CR] and [LF] markers shown in these lines were added manually to show where the line endings are.

Class   Name & Address  Item    lbs Value   Pickup/Drop off Date:   23 Sep[CR][LF]
Class1  Ben Coha[CR]
 2305 E LA st[CR]
 VISIA, PA[CR]
 932112-4422    Health and beauty product / cologne body wear for men   0.13    19[CR][LF]      
Class2  mic marce[CR]
 255 rid court[CR]
 prince frick, PA[CR]
 20442  health and beauty product / cologne body wear for women 1.5 47  

I want the file to look like the one below. (I used Notepad to replace every occurrence of [CR], and only [CR], with nothing.)

Class   Name & Address  Item    lbs Value   Pickup/Drop off Date:   23 Sep[LF]
Class1  Ben Coha 2305 E LA st VISIA, PA 932112-4422 Health and beauty product / cologne body wear for men   0.13    19[LF]      
Class2  mic marce 255 rid court prince frick, PA 20442  health and beauty product / cologne body wear for women 1.5 47  

I tried the following batch file, but it puts the whole file on one single line. It removes both the carriage returns and the line feeds.

@echo off
SetLocal DisableDelayedExpansion
for /f "delims=" %%a in (myFile.tsv) do (
echo/|set /p ="%%a%"
)>>newMyFile.tsv    

The result looks like this:

Class   Name & Address  Item    lbs Value   Pickup/Drop off Date:   23 SepClass1    Ben Coha 2305 E LA st VISIA, PA 932112-4422 Health and beauty product / cologne body wear for men   0.13    19      Class2  mic marce 255 rid court prince frick, PA 20442  health and beauty product / cologne body wear for women 1.5 47  

I want to modify the .bat file so that it removes only \r instead of removing both \r and \n.

Update: I am now able to add images, which should give a clearer idea. A similar .tsv file: (image). I want it to look like this: (image).

Vik
  • You'll have to use VBScript or PowerShell for that (it can still be embedded in a batch file). For example, use [this code](http://stackoverflow.com/a/2975775/3959875) to replace `vbCr` with `" "` and then `" "+vbLf` with `vbCrLf`. – wOxxOm Sep 30 '15 at 15:14
  • Are the records correct? Record one has a space at the start and record two doesn't have a space. Can you provide a few more records or confirm that records don't necessarily start in column 1? – foxidrive Oct 01 '15 at 05:15
  • It was a typo while posting, record one has no space. I corrected it. – Vik Oct 01 '15 at 11:47
  • Ahh :-) all is finally clear now that you incorporated [CR] and [LF] into the listed files. – dbenham Oct 01 '15 at 18:38
  • I apologize for the miscommunication on my part; I guess the question is clearer now. – Vik Oct 01 '15 at 18:41
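
The suggestion above to embed PowerShell in a batch file could look roughly like the sketch below. It is an illustration under assumptions, not a tested solution for this exact file: it assumes PowerShell is installed, that the script is run from the folder containing the file, that the file fits in memory, and that it is ASCII or UTF-8 text. The file names are the ones used in the question. Unlike the comment's recipe, it replaces each carriage return with nothing, which is what the question asks for.

```batch
@echo off
rem Sketch only: read the whole file into memory, delete every carriage
rem return (\r), and write the result to a new file. Line feeds (\n) stay.
rem Assumes PowerShell is installed and the file is ASCII or UTF-8 text.
powershell -NoProfile -Command "$text = [IO.File]::ReadAllText('myFile.tsv'); [IO.File]::WriteAllText('newMyFile.tsv', ($text -replace '\r', ''))"
```

Because the continuation lines in the sample already begin with a space, deleting the carriage returns leaves a single space between the joined parts.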

3 Answers

3

This is trivial if you use the JREPL.BAT regular expression text processing utility:

jrepl \r "" /f "myFile.tsv" /o "newMyFile.tsv"

You can overwrite the original file if you use /o -.

You must use CALL JREPL if you put the command within a batch script.
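
As a minimal sketch of that last point, a batch script that runs the same command might look like this. It assumes JREPL.BAT sits in the current folder or somewhere on the PATH; the final `echo` is only there to show that the script continues afterwards.

```batch
@echo off
rem JREPL is itself a batch file. Without CALL, control would pass to
rem JREPL.BAT and never return, so the lines after it would not run.
call jrepl \r "" /f "myFile.tsv" /o "newMyFile.tsv"
echo Finished removing carriage returns.
```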


Below is my original answer for when I thought there was a CR/LF at the end of each line in the source file (before the question was edited).

I hate editing text files with batch, as it requires a lot of arcane knowledge, has many restrictions, and the result is slow. However, it is possible to solve this with batch, and I decided I could use the practice :-)

The following works provided that each input line is <= 1021 bytes long, and the output lines are all < ~8191 bytes long.

@echo off
setlocal enableDelayedExpansion
set "input=test.txt"
set "output=out.txt"

:: Define LF to contain a linefeed character
set ^"LF=^

^"  The empty line above is critical - DO NOT REMOVE

:: Determine how many sets of 4 lines must be read
for /f %%N in ('find /c /v "" ^<"%input%"') do set /a cnt=%%N/4

<"!input!" >"!output!" (

  %= Read and write the first line =%
  set "ln="
  set /p "ln="
  <nul set /p "=!ln!!LF!"

  %= Outer loop iterates the 4 line sets =%
  for /l %%N in (1 1 !cnt!) do (

    %= Initialize out line to empty =%
    set "out="

    %= Inner loop appends next 4 lines into out =%
    for /l %%n in (1 1 4) do (
      set "ln="
      set /p "ln="
      set "out=!out!!ln!"
    )

    %= Write the line =%
    <nul set /p "=!out!!LF!"
  )
)
dbenham
  • It removed the CRs from the file but replaced them with LFs. For the first line, it replaced CR LF with LF, which is OK. For the other record lines (from the second line on), it replaced CR with LF, whereas here we just want to remove the CR (replace it with nothing). – Vik Oct 01 '15 at 14:18
  • @Vik - I tested with the sample input provided in your question, and it generated the requested output, with only LF at the end of each line. So I am at a loss as to what else you expect?! Perhaps you should edit your question to indicate where the CR and LF are in your source file (use `\r` and `\n`). – dbenham Oct 01 '15 at 15:26
  • Sorry for the confusion. I tried to attach screenshots from Notepad++ with CR and LF displayed while posting the question, but it said that I do not have enough reputation points to do so. I will edit my question to show LF/CR. Once again, sorry for the confusion. – Vik Oct 01 '15 at 16:18
2

Can you live with an empty first line?

@echo off
SetLocal EnableDelayedExpansion
(
  for /f "delims=" %%a in (myFile.tsv) do (
    set "line=%%a"
    if "!line:~0,1!" neq " " (
      echo(!newline!
      set "newline=!line!"
    ) else (
      set "newline=!newline!!line!"
    )
  )
  echo(!newline!
)>newMyFile.tsv  
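
If the empty first line is a problem, one possible tweak is to print the accumulated record only once it actually holds something. The sketch below is the same approach with that single check added. It keeps the assumptions of the script above: continuation lines arrive as separate lines that begin with a space, and the data contains no `!` characters, which delayed expansion would drop.

```batch
@echo off
SetLocal EnableDelayedExpansion
rem Start with no pending record, even if "newline" was set by the caller.
set "newline="
(
  for /f "delims=" %%a in (myFile.tsv) do (
    set "line=%%a"
    if "!line:~0,1!" neq " " (
      rem A new record begins: flush the previous one, if there is one.
      if defined newline echo(!newline!
      set "newline=!line!"
    ) else (
      rem Continuation line: append it to the pending record.
      set "newline=!newline!!line!"
    )
  )
  rem Flush the last record.
  if defined newline echo(!newline!
)>newMyFile.tsv
```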
Stephan
2

Perhaps something like this works for you. It simply replaces every CR with nothing.

setlocal DisableDelayedExpansion
for /F "usebackq" %%C in (`copy /Z "%~dpf0" nul`) DO (
    for /F "delims=" %%L in (myFile.tsv) do (
      set "line=%%L"
      if defined line (
          setlocal EnableDelayedExpansion
          echo(!line:%%C=!
          endlocal
      ) ELSE echo(
    )
)
endlocal
jeb
  • +1, but why not move the `setlocal disableDelayedExpansion` out of the loop so you only need one toggle within the loop? – dbenham Oct 01 '15 at 18:29
  • @dbenham It only toggles once, as it's in the outer loop, but you are still right, it's better for visibility to move it – jeb Oct 01 '15 at 19:10