
I want to remove data from the middle of a file. Because the file is large, I would like to avoid rewriting the entire file. To move the data, I am trying to copy everything from the end of the data (byte position) up to the end of the file back to the start of the data (byte position), and then truncate the file to file size - data size.

I have this code, but I can't get it to read from and write to the same file at the same time...

' Read from end of data in file to byte position of start of data in file
Using inFile As New System.IO.FileStream(NAME, IO.FileMode.Open, IO.FileAccess.Read)
    Using outFile As New System.IO.FileStream(NAME, IO.FileMode.Open, IO.FileAccess.Write)
        inFile.Seek((START_POS + DATA_SIZE), IO.SeekOrigin.Begin)
        outFile.Seek(START_POS, IO.SeekOrigin.Begin)
        Do
            If FILESIZE - CURRENT_TOTAL_READ < BUFFER_SIZE Then
                ReDim BUFFER((FILESIZE - 1) - CURRENT_TOTAL_READ)
            End If
            BYTES_READ = inFile.Read(BUFFER, 0, BUFFER.Length)
            If BYTES_READ > 0 Then
                outFile.Write(BUFFER, 0, BYTES_READ)
                CURRENT_TOTAL_READ += BYTES_READ
            End If
        Loop While BYTES_READ > 0 AndAlso CURRENT_TOTAL_READ < (DATA_SIZE- 1)
    End Using
End Using
TOTAL_PROCESSED_DATA += CURRENT_TOTAL_READ
' truncate file to file size - data size
TRUNCATE(NAME, (FILESIZE - DATA_SIZE))
Daniel Valland
  • Nope. You are going to have to re-write the file. Read it in, write it out, leave out the parts you don't want to keep. – LarsTech Jun 29 '14 at 13:37
  • @LarsTech I don't see why I have to re-write the entire file in any case... I could write the end part of the file to a temporary file, truncate the original file, then append the temporary file back to the original file, but that is a very messy way of doing it, mostly because I would rather not use a large amount of temporary storage space. But it could be the only way of doing it... – Daniel Valland Jun 29 '14 at 13:42
  • Read about memory-mapped files; maybe that's an option. – VladL Jun 29 '14 at 16:51

1 Answer


You can use a single FileStream and set its Position before each read and write.

Points against this:

  • if it goes wrong part-way through (e.g. a power failure), the file is left half-shifted and you are going to have a big mess to clean up
  • if you are using a hard disk drive, all the seeking between the read and write positions will be hard on it unless BUFFERSIZE is large enough
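
If the first risk is unacceptable, the rewrite that LarsTech describes in the comments is the safer route, at the cost of temporary disk space for everything that is kept. A minimal sketch, assuming a hypothetical helper named `CutByRewriting` and a temporary file created by appending `.tmp` to the original name (neither name comes from the question or the comments):

```vbnet
Imports System.IO

Module SafeCutSketch

    ' Hypothetical helper: copy everything except the cut range into a
    ' temporary file, then swap that file in for the original.
    Sub CutByRewriting(filename As String, cutPos As Int64, cutLength As Int64)
        Dim tempName As String = filename & ".tmp"   ' assumed naming scheme
        Dim buffer(81919) As Byte                     ' 80 KB copy buffer

        Using src As New FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.None)
            Using dst As New FileStream(tempName, FileMode.Create, FileAccess.Write, FileShare.None)
                ' Copy the part before the cut.
                Dim remaining As Int64 = cutPos
                While remaining > 0
                    Dim toRead As Integer = CInt(Math.Min(remaining, buffer.Length))
                    Dim n As Integer = src.Read(buffer, 0, toRead)
                    If n = 0 Then Exit While
                    dst.Write(buffer, 0, n)
                    remaining -= n
                End While

                ' Skip the cut range, then copy whatever follows it.
                src.Position = Math.Min(cutPos + cutLength, src.Length)
                src.CopyTo(dst)
            End Using
        End Using

        ' The original is only replaced once the shortened copy is complete.
        File.Replace(tempName, filename, Nothing)
    End Sub

End Module
```

An interruption during the copy leaves the original file untouched; only the leftover `.tmp` file needs deleting.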

Having said that, the following takes only a couple of seconds to cut a small part out near the start of a 1 GB file on an SSD, or less than ten seconds on an HDD.

N.B. I did a casual check that it works correctly, but there might be an off-by-one error in there somewhere.

Imports System.IO

Module Module1

    Sub CreateTestData(filename As String)
        If File.Exists(filename) Then
            File.Delete(filename)
        End If

        ' put AAAA....BBBB... etc at the start of the file for easy inspection
        Using sw As New StreamWriter(filename)
            For c = Asc("A") To Asc("D")
                Dim s = New String(Chr(c), 1024)
                sw.Write(s)
            Next
        End Using

        ' Make the file 1GB in size.
        Using fs As New FileStream(filename, FileMode.Open, FileAccess.ReadWrite, FileShare.None)
            fs.SetLength(1024 * 1024 * 1024)
        End Using

    End Sub

    Sub CutMiddleOutOfFile(filename As String, cutPos As Int64, cutLength As Int64)
        If cutPos < 0 OrElse cutLength < 0 Then
            Throw New ArgumentException("Cut parameters must not be negative.")
        End If
        'TODO: More argument checking regarding cutPos, cutLength, and the length of the file.

        ' Use a fairly large buffer
        Const BUFFERSIZE As Integer = 1024 * 1024

        ' Let FileStream decide its own internal buffer size.
        Using fs As New FileStream(filename, FileMode.Open, FileAccess.ReadWrite, FileShare.None)

            ' VB array bounds are inclusive, so an upper bound of BUFFERSIZE - 1 gives BUFFERSIZE bytes.
            Dim buffer(BUFFERSIZE - 1) As Byte
            Dim bytesRead As Integer = Integer.MaxValue
            Dim currReadPos As Int64 = cutPos + cutLength
            Dim currWritePos As Int64 = cutPos

            While bytesRead > 0
                fs.Position = currReadPos
                bytesRead = fs.Read(buffer, 0, BUFFERSIZE)

                If bytesRead > 0 Then
                    fs.Position = currWritePos
                    fs.Write(buffer, 0, bytesRead)
                End If

                currReadPos += bytesRead
                currWritePos += bytesRead

            End While

            fs.SetLength(currWritePos)

        End Using

    End Sub

    Sub Main()
        Dim filename = "D:\temp\a.dat"
        CreateTestData(filename)
        CutMiddleOutOfFile(filename, 2048, 1024)
        Console.WriteLine("Done.")
        Console.ReadLine()

    End Sub

End Module
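
To go beyond a casual check for off-by-one errors, the result can be compared byte for byte against the expected contents on a small file. The sketch below is only an illustration: `CutMiddle` repeats the same read-then-write loop as `CutMiddleOutOfFile` so that the block compiles on its own, and `CutIsCorrect`, the 4 KB buffer, and the test sizes are assumptions made for the example.

```vbnet
Imports System.IO
Imports System.Linq

Module CutCheckSketch

    ' Same technique as CutMiddleOutOfFile: one FileStream, repositioned
    ' before every read and every write.
    Sub CutMiddle(filename As String, cutPos As Int64, cutLength As Int64)
        Dim buffer(4095) As Byte
        Using fs As New FileStream(filename, FileMode.Open, FileAccess.ReadWrite, FileShare.None)
            Dim readPos As Int64 = cutPos + cutLength
            Dim writePos As Int64 = cutPos
            Dim bytesRead As Integer
            Do
                fs.Position = readPos
                bytesRead = fs.Read(buffer, 0, buffer.Length)
                If bytesRead > 0 Then
                    fs.Position = writePos
                    fs.Write(buffer, 0, bytesRead)
                    readPos += bytesRead
                    writePos += bytesRead
                End If
            Loop While bytesRead > 0
            fs.SetLength(writePos)
        End Using
    End Sub

    ' Hypothetical check: True if cutting [cutPos, cutPos + cutLength) out of
    ' a generated file leaves exactly the bytes before and after that range.
    Function CutIsCorrect(totalLength As Integer, cutPos As Integer, cutLength As Integer) As Boolean
        Dim original(totalLength - 1) As Byte
        For i As Integer = 0 To totalLength - 1
            original(i) = CByte(i Mod 251)
        Next

        Dim expected As Byte() = original.Take(cutPos).Concat(original.Skip(cutPos + cutLength)).ToArray()

        Dim tempFile As String = Path.GetTempFileName()
        Try
            File.WriteAllBytes(tempFile, original)
            CutMiddle(tempFile, cutPos, cutLength)
            Return File.ReadAllBytes(tempFile).SequenceEqual(expected)
        Finally
            File.Delete(tempFile)
        End Try
    End Function

    Sub Main()
        Console.WriteLine(CutIsCorrect(10000, 2048, 1024))  ' cut in the middle
        Console.WriteLine(CutIsCorrect(10000, 0, 1))        ' cut the first byte
        Console.WriteLine(CutIsCorrect(10000, 9999, 1))     ' cut the last byte
    End Sub

End Module
```

The boundary cases (first byte, last byte, a cut that spans several buffer refills) are where an off-by-one error would be most likely to show up.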
Andrew Morton