
I have a number of text files that each need the first 26 lines deleted. I tried the batch file below, but it doesn't even finish the first text file. The files are named data (1).txt, data (2).txt, data (3).txt, etc.

At first I tried...

more +26 "data (1).txt" > "data (1).txt.new"
move /y "data (1).txt.new" "data (1).txt"

This worked, but changing the number by hand for each of my ~100 text files would be extremely time-consuming.

So then I tried the following:

for %%f in (*.txt) do (
more +26 "%%f" > "%%f.new"
move /y "%%f.new" "%%f")

To me it seems like this should work, but it doesn't: it just opens the command prompt and stalls on the first file. It does create the ".new" file, but it looks like only about half of the original text file was copied. The files range from 1MB to ~300MB each.

So my question is simple: what am I doing wrong, and can anyone provide help or tips?

UPDATE

I've kept experimenting with the second option, and it seems to work for files up to ~125MB. For anything larger it just pauses and never completes the operation. I'm not sure if there is a fix for that, or possibly a better option than using a batch file. Again, any help is appreciated.

UPDATE

I was able to get what I was looking for using Java.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;

public class CleanFiles {
  public static void main(String[] args) throws Exception {
    // Read from ./input and write to ./output, relative to the working directory.
    String currDir = System.getProperty("user.dir");
    File inputDir = new File(currDir + File.separator + "input" + File.separator);
    File[] inputFiles = inputDir.listFiles();

    String outputDir = currDir + File.separator + "output" + File.separator;
    for (File inputFile : inputFiles) {
      // Skip anything that is not a .txt file.
      if (!inputFile.getAbsolutePath().endsWith(".txt")) { continue; }
      File outputFile = new File(outputDir + inputFile.getName() + ".csv");
      BufferedReader reader = null;
      FileWriter writer = null;
      try {
        reader = new BufferedReader(new FileReader(inputFile));
        writer = new FileWriter(outputFile);

        String line;
        // Discard the header: drop lines until the first one starting with "Point".
        while ((line = reader.readLine()) != null) {
          if (line.startsWith("Point")) {
            writer.append(line);
            writer.append("\r\n");
            break;
          }
        }
        // Copy the rest of the file unchanged.
        while ((line = reader.readLine()) != null) {
          writer.append(line);
          writer.append("\r\n");
        }
      } catch (Exception e) {
        // Report the failure and move on to the next file.
        System.err.println("Failed on " + inputFile + ": " + e);
      } finally {
        try {
          if (reader != null) { reader.close(); }
          if (writer != null) { writer.flush(); writer.close(); }
        } catch (Exception e) {}
      }
    }
  }
}
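
The code above keys on the first line starting with "Point" rather than on a line count. For the original goal of dropping a fixed 26 lines, a minimal sketch might look like the following. The names SkipLines and skipFirstLines are made up for illustration, and it assumes the files are ANSI or UTF-8 (not UTF-16); ISO-8859-1 is used only because it passes single bytes through unchanged.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class SkipLines {
  // Copies src to dst, dropping the first `skip` lines.
  // Streams line by line, so large files are never held in memory.
  // Assumption: single-byte or UTF-8 text; output lines end in CRLF.
  static void skipFirstLines(Path src, Path dst, int skip) throws IOException {
    try (BufferedReader reader = Files.newBufferedReader(src, StandardCharsets.ISO_8859_1);
         BufferedWriter writer = Files.newBufferedWriter(dst, StandardCharsets.ISO_8859_1)) {
      String line;
      int lineNo = 0;
      while ((line = reader.readLine()) != null) {
        if (++lineNo <= skip) { continue; }
        writer.write(line);
        writer.write("\r\n");
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");

    // Collect the file list first so files rewritten below are not listed twice.
    List<Path> files = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.txt")) {
      for (Path p : stream) { files.add(p); }
    }

    for (Path file : files) {
      // Same pattern as the batch attempt: write "<name>.new", then replace the original.
      Path tmp = file.resolveSibling(file.getFileName() + ".new");
      skipFirstLines(file, tmp, 26);
      Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

Run it from the folder holding the data files, or pass that folder as the first argument. It overwrites the originals, so try it on a copy first.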
Matt

2 Answers

1

I recommend using sed for Windows. You'll need the binaries and the dependencies linked from that page. Then you can just sed "1,26d" infile >outfile in a for loop from the command line to delete the first 26 lines of your files. No batch file needed.

for %I in (*.txt) do (sed "1,26d" "%I" >"%I.1" && move /y "%I.1" "%I")

Note: gnuwin32 sed has an -i switch (for in-place editing) which would make the syntax a bit simpler, but the last time I tried it, it left a garbage file behind for each real file it processed. I recommend not using it.

I know from painful experience that using a stream processing application to handle large text files is MUCH faster than batch script trickery and for /f loops.

If you want to avoid using gnuwin32 sed and would prefer to use powershell, see this question's accepted answer for a worthwhile method to try. No clue whether it'd be as fast or faster than sed, though. Bill_Stewart seems enthusiastic about it. :)

rojo
  • No need to download anything. Forget about using cmd.exe shell scripting (batch) `for /f` ugliness and just use PowerShell. – Bill_Stewart Nov 13 '14 at 15:53
0

If you look at the last line of your output file, you'll see the limitation of your approach. When the number of lines exceeds ~65535, MORE hangs, waiting for a key press from the user.
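
To see which files are likely to hit this, you could count their lines first. Here is a rough sketch; the class name LineCount and the 65535 constant are placeholders based on the approximate figure above, not a documented limit, and it assumes ANSI or UTF-8 files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LineCount {
  // Approximate threshold taken from the observation above (an assumption).
  static final long MORE_LINE_LIMIT = 65535;

  // Counts lines by streaming, so file size does not matter.
  static long countLines(Path file) throws IOException {
    long n = 0;
    try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
      while (reader.readLine() != null) { n++; }
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.txt")) {
      for (Path file : stream) {
        long lines = countLines(file);
        // Flag the files that MORE would probably stall on.
        if (lines > MORE_LINE_LIMIT) {
          System.out.println(file.getFileName() + ": " + lines + " lines");
        }
      }
    }
  }
}
```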


You can use a for loop instead:

for %%I in (*.txt) do for /f "usebackq delims=, tokens=* skip=26" %%x in ("%%I") do echo %%x >> "%%I.new"
Chirag Bhatia - chirag64
  • I tried what you posted and it simply opens the command prompt for a second and closes without completing anything... – Matt Nov 12 '14 at 19:35
  • Forgot to mention, it seems to work only on text files encoded in ANSI and UTF-8, not Unicode. Try converting your files to UTF-8 using Notepad. – Chirag Bhatia - chirag64 Nov 12 '14 at 19:39
  • Or if you have too many files, you could run a for loop over all of them with the `type` command to convert all Unicode files to ANSI. Something like `for %%I in (*.txt) do type "%%I" > "%%I.new"`, and then use `move` to replace the original files with the new files, like the code in your question. Please note that this is an option only if your text files do not contain special characters (mainly non-English characters); otherwise you may end up losing data during the conversion. – Chirag Bhatia - chirag64 Nov 12 '14 at 20:01
  • Or if op could use sed for Windows, it could be done with a one-liner from the command line, no script needed. – rojo Nov 12 '14 at 23:34
  • Simple in PowerShell also. No need to struggle with cmd.exe shell script (batch). – Bill_Stewart Nov 13 '14 at 00:18
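
The encoding conversion suggested in the comments above could also be done without losing non-English characters by writing UTF-8 instead of ANSI. This is a minimal sketch, assuming the "Unicode" files are UTF-16 with a byte-order mark (what Notepad writes under that name); the class name ToUtf8 and method utf16ToUtf8 are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ToUtf8 {
  // Re-encodes a UTF-16 file as UTF-8 (no BOM written), streaming in chunks.
  // The UTF_16 decoder reads the byte-order mark to pick the endianness;
  // without a BOM it falls back to big-endian, which may not match your files.
  static void utf16ToUtf8(Path src, Path dst) throws IOException {
    try (BufferedReader reader = Files.newBufferedReader(src, StandardCharsets.UTF_16);
         BufferedWriter writer = Files.newBufferedWriter(dst, StandardCharsets.UTF_8)) {
      char[] buf = new char[8192];
      int n;
      while ((n = reader.read(buf)) != -1) {
        writer.write(buf, 0, n);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Usage: java ToUtf8 <input file> <output file>
    utf16ToUtf8(Paths.get(args[0]), Paths.get(args[1]));
  }
}
```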