1

I have written code to concatenate sample files into a single file, keeping the header only once instead of repeating it for each file.

Input files:

File 1:

[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14

File 2:

[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43

Expected Output:

[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14

Actual Output:

[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14

Below is the code used for this operation:

@echo off
break>Combined.csv
cls
setlocal enabledelayedexpansion

if exist C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv del C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv

dir /a-d /b C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv>C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt

cd C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\

for /f "tokens=*" %%A in (C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt) do (
    set /p header=<%%A
    if "!header!" neq "" (
        (echo(!header!)>Combined.csv
        goto :break_for
    )

)
:break_for

for /f "tokens=*" %%A in (C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt) do (
        more +1 %%A>>Combined.csv
   )

del dirfiles.txt

Can someone please help me resolve this issue? I am new to batch scripting and unable to debug it.

kartikeya_aj
  • Please learn how to format code portions properly; use the `{}` button in the editor region... – aschipfl Mar 31 '16 at 06:45
  • Duplicate of http://stackoverflow.com/a/19592600/3664960 – Magoo Mar 31 '16 at 07:39
  • I improved formatting of the sample CSV files -- see my [edit](http://stackoverflow.com/q/36325776/5047996/3); note that I removed a truncated line from sample file 2, because I considered it a copy-paste error, and that line did not occur in the sample output files; if I did something wrong, feel free to edit the post once again... – aschipfl Mar 31 '16 at 08:27
  • Possible duplicate of [Windows Batch file execution error](http://stackoverflow.com/questions/36057140/windows-batch-file-execution-error) – aschipfl Apr 08 '16 at 10:53

3 Answers

1

A couple of points about this question:

  • This question is an exact duplicate of Windows Batch file execution error
  • At that question there are 4 answers, one of which is mine.
  • In my answer I asked you to post a small section of your data files, but you never replied.
  • This is a copy of my answer at that question after I slightly modified it in order to insert the key point of your problem: the headers contain TWO lines:

EDIT: I modified the code according to the new specifications posted in a comment: there are three lines of headers in each file, but just the 3rd must be included in the output.

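@echo off
setlocal enabledelayedexpansion

cls

REM cd C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\

REM header3 stays undefined until the first file has been read
set "header3="
(for %%A in (*.csv) do (

   REM Only for the first file: read its three header lines and output the 3rd
   if not defined header3 (
      (set /p "header1=" & set /p "header2=" & set /p "header3=") <"%%A"
      echo(!header3!
   )

   REM Output the rest of the file, after its three header lines
   more +3 "%%A"

)) > Combined.txt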
  • And this is the generated Combined.txt file when this program is run with your data above:

[ Row : Header ],,,,,,,,,
ContractNum,ProgramNum,CustomerNum,TierNum,StartDate,EndDate,DateCreated,CreatedBy,DateUpdated,UpdatedBy
00032116,21238,60304PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00032116,21238,81790PRMI,3,2014-05-02,2017-09-30,Administrator,Administrator,2016-02-29 10:46:14,2016-02-29 10:46:14
00024067,15562,9942PRMI,1,2014-09-16,2016-12-31,gintgUser,gintgUser,2016-02-21 05:59:43,2016-02-21 05:59:43

As you can see, the output is the same as the one you want.

EDIT: I can't test the modification because the posted input files do not contain the same data as the real files...

  • You should follow up on the questions you post and not post new questions with the exact same problem as a previous one.
  • You should be clearer in the description of your problem and post example data.
Aacini
  • @Aacini: As you said that I need to repost my doubt as a different question, I did so. I am sorry if this infringes the rules of the forum; I was unaware of them. Should I remove this question? Also, thank you for your help. There are actually three header lines: the first is white space, followed by the two-row header, and the data starts from the 4th line. I have to skip the first two lines and pick the header from the 3rd. I have tried the following: `for /f "tokens=* skip=2" %%A in (C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\dirfiles.txt) do (` but it doesn't work. – kartikeya_aj Apr 01 '16 at 06:14
  • **1.** I _never_ said that you need to repost a different question! I said: _"post a small section of your files... Please, edit the question, do NOT post additional data in comments!"_ (you may reread [my comment](http://stackoverflow.com/questions/36057140/windows-batch-file-execution-error/36064379#comment60158294_36064379)). It is bad netiquette to just abandon an open question with no further replies (like a conversation). – Aacini Apr 03 '16 at 19:51
  • **2.** Perhaps you haven't realized yet that the core point of this question is about the headers: there are three lines of headers and just the 3rd must be included in the output. However, such important information does _not_ appear in the question, but in comments (_"please, do NOT post additional data in comments!"_). You should include that information in **THIS QUESTION** (_NOT_ in a new one). How? _"Edit the question"_. How? Via the grey "edit" link that appears below the question, immediately below the `windows` and `batch-file` blue tags (between "share" and "close"). – Aacini Apr 03 '16 at 19:52
  • **3.** If your data files have three lines of headers (the first one is empty), why do such lines _not_ appear in your posted input files? The purpose of posting example data is that we can access _the SAME data_ you have. If you modify your data when you post it, the posted data is of no use... **4.** I don't understand why you ask me about a `for /f` command that does _NOT_ appear in _my code_! Anyway, if you just said that there "are actually three header lines", why did you use "skip=2"? ("three" is _not_ "2"). **5.** I modified my solution according to the _new_ header specifications. – Aacini Apr 03 '16 at 19:53
  • Thanks for these comments; I will keep them in mind and not repeat these mistakes :). The solution works too. I think that the white space got edited out by someone. I'll add it back. Thanks for your help. – kartikeya_aj Apr 04 '16 at 11:19
0

There is no need for an interim file that contains a list of CSV files; you can read and combine them with a standard for loop and a nested for /F loop, using its skip option to get rid of the headers (assuming the header is always a single line). The initial header can be taken from another for/for /F loop construct that is left during its first iteration:

> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
    for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
        for /F "usebackq eol=| delims=" %%L in ("%%~F") do (
            echo(%%L
            goto :LEAVE
        )
    )
)
:LEAVE
>> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
    for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
        for /F "usebackq skip=1 eol=| delims=" %%L in ("%%~F") do (
            echo(%%L
        )
    )
)

If you need a specific sort order of the CSV files, you need another for /F loop instead of the standard for loop that parses the output of a dir /B command to do that job. The following example takes a two-line header, then it sorts the files from oldest to newest modification dates:

> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
    set "FLAG="
    for %%F in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv") do (
        for /F "usebackq eol=| delims=" %%L in ("%%~F") do (
            echo(%%L
            if defined FLAG goto :LEAVE
            set "FLAG=#"
        )
    )
)
:LEAVE
>> "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\Combined.csv" (
    for /F "eol=| delims=" %%F in ('
        dir /B /A:-D /O:D /T:W "C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\ContractEligibility_*.csv"
    ') do (
        for /F "usebackq skip=2 eol=| delims=" %%L in ("C:\Users\kartikeya.avasthi\Desktop\Batch_Scripts\%%F") do (
            echo(%%L
        )
    )
)
aschipfl
  • Thanks for the above. Is there some way to pick up the header in the first for loop from the third line onwards? Also, I am new to the community, so thank you for the formatting updates; much appreciated. – kartikeya_aj Mar 31 '16 at 09:20
  • @kartikeya_aj, so the header spans over lines 1 and 2 (like your sample data show)? see my [edit](http://stackoverflow.com/a/36327709/5047996/2) (coming soon)... – aschipfl Mar 31 '16 at 10:55
  • Alright... the samples in your question show two rows/lines, so it doesn't make sense to adapt my answer... anyway, you got the idea how to do it; you could also use a counter, say `COUNT`, which you increment in the 1st loop like `set /A COUNT+=1`, and then leave the loop conditionally like `if !COUNT! EQU 3 goto :LEAVE`; as you can see (`!COUNT!`) you'll need delayed expansion then... – aschipfl Apr 01 '16 at 10:44
  • Thanks for this solution – kartikeya_aj Apr 04 '16 at 11:20
0

If you felt like installing awk - one of the handiest programs around from Unix/Linux - your task would become very simple. It is available for Windows from here.

Then you could just use:

awk  'NR<3 || FNR>2'  *.csv

To explain the command, you need to know that NR is the Number of the Record (i.e. the line number): it starts at one for the first record/line of the first file and keeps incrementing with each record across all files, so it is less than 3 only for the first two records of the very first file. FNR, on the other hand, is the record number within the current file: it counts the same way but resets to one as each new file is opened, so it is greater than 2 from the third record of every file onward.

So, in summary, the command says... "Print any line if it is one of the very first two lines of all the input files, or if it is past line 2 of any of the files."
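To see the two counters side by side, here is a minimal sketch for a Unix-style shell. The temporary directory, the file names f1.csv and f2.csv, and their contents are made up for illustration; they are not the asker's real data. The first awk call prints NR and FNR in front of every line, and the second applies the filter from above.

```shell
# Hypothetical demo files in a temporary directory (illustrative names and data).
tmp=$(mktemp -d)
printf 'H1\nH2\na\nb\n' > "$tmp/f1.csv"
printf 'H1\nH2\nc\n'    > "$tmp/f2.csv"

# Show NR (global record number) and FNR (per-file record number) for each line.
counters=$(awk '{ print NR, FNR, $0 }' "$tmp/f1.csv" "$tmp/f2.csv")
printf '%s\n' "$counters"

# Keep the two header lines of the first file only, plus every data line.
merged=$(awk 'NR<3 || FNR>2' "$tmp/f1.csv" "$tmp/f2.csv")
printf '%s\n' "$merged"

# Clean up the demo files.
rm -r "$tmp"
```

In the first listing, the first line of f2.csv shows up as `5 1 H1`: NR has kept counting while FNR has restarted at 1. The second listing contains H1 and H2 once, followed by a, b and c.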

Note that you may need to replace the single quotes with double quotes on Windows.

Note that if you were to download gawk, it will work just the same as awk for this example.

Mark Setchell