With reference to my previous post, "Remove first line from a delimited file":

I was able to process smaller files and remove the first line, but with huge files I run into memory problems, because I am reading the whole file and then writing it back again.

Can anybody suggest a better alternative?

Thanks in advance.

Vivek

  • Deleting the first line involves rewriting the entire file. As the file grows, the number of lines increases, but so too does the size of the file to rewrite, so repeatedly removing the first line this way is an O(n^2) operation overall. You should consider another approach, as others have already suggested. – Peter Lawrey Jul 18 '11 at 11:46
  • Yes, exactly, that is the problem: rewriting the whole file is an expensive task, and if the file size runs into gigabytes it is definitely a problem. I have tried FileChannel and RandomAccessFile, but they prove inefficient for huge files. Using BufferedReader / PrintWriter would also involve rewriting the whole file, which I would like to avoid. I am not sure about any other approach; can you suggest one? – Vivek Jul 18 '11 at 11:59

2 Answers

You have to read the file line by line and write every line except the first to a new file, so that only one line is held in memory at a time:

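import java.io.*;

// Stream foo.txt into _foo.txt, skipping the first line.
// try-with-resources (Java 7+) closes both files and flushes the writer.
try (BufferedReader reader = new BufferedReader(new FileReader("foo.txt"));
     PrintWriter writer = new PrintWriter(new FileWriter("_foo.txt"))) {
    String line;
    boolean firstLine = true;

    while ((line = reader.readLine()) != null) {
        if (firstLine) {
            firstLine = false; // drop the first line and move on
            continue;
        }
        writer.println(line);
    }
}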
AlexR

To avoid rewriting the entire file to remove one line, you can maintain an index to the "start" of the file. This index is where you consider the start to be, and where you would begin reading the file from. Periodically, e.g. once a night, you can rewrite the file so that this "start" is where the file actually starts.

This "start" location can be stored in another file or at the start of the existing file.

This means you can progressively "remove" all the lines of a file without re-writing it at all.
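
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the class name `LogicalStart`, its method names, and the `.offset` / `.tmp` sidecar file names are all hypothetical, and the offset is kept in a separate file rather than at the start of the data file. It also assumes single-byte line content (`RandomAccessFile.readLine` maps each byte to one char) and a single process using the file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: keeps the logical start of a data file in a sidecar file.
final class LogicalStart {

    // Assumed naming scheme: "<data file>.offset" holds the start offset as text.
    static Path offsetFile(Path data) {
        return Paths.get(data.toString() + ".offset");
    }

    // Returns the stored start offset, or 0 if no offset has been recorded yet.
    static long readStart(Path data) throws IOException {
        Path idx = offsetFile(data);
        if (!Files.exists(idx)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(idx), StandardCharsets.UTF_8);
        return Long.parseLong(text.trim());
    }

    static void writeStart(Path data, long start) throws IOException {
        Files.write(offsetFile(data), Long.toString(start).getBytes(StandardCharsets.UTF_8));
    }

    // "Removes" the first line by advancing the stored offset past it.
    // The data file itself is not modified. Returns the line, or null if none is left.
    static String removeFirstLine(Path data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(data.toFile(), "r")) {
            raf.seek(readStart(data));
            String line = raf.readLine();
            if (line != null) {
                writeStart(data, raf.getFilePointer());
            }
            return line;
        }
    }

    // Occasional clean-up (e.g. nightly): copy the live tail to a temporary file,
    // swap it in, and reset the offset. Not crash-safe between the last two steps.
    static void compact(Path data) throws IOException {
        Path tmp = Paths.get(data.toString() + ".tmp");
        try (RandomAccessFile in = new RandomAccessFile(data.toFile(), "r");
             OutputStream out = Files.newOutputStream(tmp)) {
            in.seek(readStart(data));
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        Files.move(tmp, data, StandardCopyOption.REPLACE_EXISTING);
        writeStart(data, 0L);
    }
}
```

With this sketch, each call to `removeFirstLine` costs one seek and one short read regardless of file size, and only `compact` pays the cost of copying the remaining data. Any reader of the data file would need to seek to `readStart(data)` first.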

Peter Lawrey