I am trying to figure out how to store data that can be easily/heavily edited.

Reading data from a big single file isn't really a problem. The problem starts when I need to make changes to that file.

Let's say I have a big log file to which a string is always appended. The filesystem then needs to recreate the whole file since it has changed, and the bigger the file, the heavier the performance cost.

What I could do instead is simply create a new file for each log entry. Creating, removing and editing would be more efficient, until I want to copy all these files, let's say to a new SSD.

Reading directories and copying thousands of files, even if they are small, hits performance hard. So maybe bundle all the files into a single file/archive?

But then, AFAIK, an archive like .zip also needs to be recreated when something changes.

Is there a good or maybe even simple solution to this?

How does a single-file database like SQLite handle this?

Note: I am using C#.
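
To make the two layouts concrete, here is a minimal sketch of what is being compared: one log file that keeps growing versus one small file per entry. The directory name, the file names and the helper functions `AppendToSingleFile` and `WriteAsOwnFile` are hypothetical, chosen only for illustration. It says nothing about which layout is faster; that would have to be measured.

```csharp
using System;
using System.IO;

// Hypothetical scratch directory, used only for this illustration.
string root = Path.Combine(Path.GetTempPath(), "log-layout-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(root);

// Layout 1: one big log file; every entry is appended to the same file.
string singleLog = Path.Combine(root, "app.log");
void AppendToSingleFile(string entry) =>
    File.AppendAllText(singleLog, entry + Environment.NewLine);

// Layout 2: one small file per entry, all kept in one directory.
string entryDir = Path.Combine(root, "entries");
Directory.CreateDirectory(entryDir);
int nextId = 0;
void WriteAsOwnFile(string entry) =>
    File.WriteAllText(Path.Combine(entryDir, $"{nextId++:D6}.log"), entry);

// Write the same three entries using both layouts.
for (int i = 0; i < 3; i++)
{
    AppendToSingleFile($"entry {i}");
    WriteAsOwnFile($"entry {i}");
}

int linesInSingleFile = File.ReadAllLines(singleLog).Length;
int filesInDirectory = Directory.GetFiles(entryDir).Length;
Console.WriteLine($"{linesInSingleFile} lines in one file, {filesInDirectory} separate files");
```

Copying the first layout to another drive means moving one file; copying the second means enumerating and copying every file in `entries`.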

MIkey
  • For your log example, I find that [File.AppendText(String)](https://learn.microsoft.com/en-us/dotnet/api/system.io.file.appendtext?view=net-5.0) works just fine. It uses a stream under the hood, so it simply seeks to the end of the file and adds the text. – Robert Harvey Sep 27 '21 at 14:31
  • Your assumption that a file must be re-created when something is changed or added is wrong. – Oliver Sep 27 '21 at 15:14
  • @Oliver Okay, thanks for clearing this up for me! Would you mind telling me where I can get a better understanding of this matter? Or explain it yourself, if it's not too much to ask. – MIkey Sep 27 '21 at 15:45
  • In general, how files are organized depends on the filesystem, and how the bytes are physically stored depends on the drive controller. Today most file systems are organized in blocks or nodes. These can be seen as similar to a linked list, which allows adding, removing or inserting blocks as needed without rewriting everything. Drives themselves have their own techniques to increase I/O performance and lifetime, like caches, wear leveling, round-robin access, deduplication, reserved space and many more. So don't think too hard about these things and try to write good, readable code. – Oliver Sep 28 '21 at 05:24
  • And if you really run into performance problems come back with a more concrete question (e.g. I measured an I/O performance of 100MB/sec by using approach ..., but I need 1GB/sec) – Oliver Sep 28 '21 at 05:25
  • Another article that might help: https://www.codeproject.com/Articles/17716/Insert-Text-into-Existing-Files-in-C-Without-Temp – Oliver Sep 28 '21 at 05:26
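
The points made in the comments above can be checked with a minimal sketch. It appends to an existing file with `File.AppendText`, then overwrites a few bytes in place by seeking with a `FileStream`. The temp file name and the sample strings are assumptions made up for this illustration. What the filesystem and the drive do underneath is not visible from this code; it only shows that the application itself never rewrites the existing content.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical temp file, used only for this illustration.
string path = Path.Combine(Path.GetTempPath(), "append-demo-" + Guid.NewGuid().ToString("N") + ".log");

File.WriteAllText(path, "first line\n");
long lengthBefore = new FileInfo(path).Length;

// File.AppendText opens the file positioned at its end;
// the application writes only the new text.
using (StreamWriter writer = File.AppendText(path))
{
    writer.Write("second line\n");
}
long lengthAfter = new FileInfo(path).Length;

// Overwrite in place: seek to an offset and write the same number of bytes.
// The file length does not change and nothing after the patch is touched.
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Write))
{
    stream.Seek(0, SeekOrigin.Begin);
    byte[] patch = Encoding.UTF8.GetBytes("FIRST");
    stream.Write(patch, 0, patch.Length);
}

string content = File.ReadAllText(path);
Console.WriteLine($"{lengthBefore} -> {lengthAfter} bytes");
Console.Write(content);
File.Delete(path);
```

Appending and same-length overwrites are the cheap cases. Inserting or removing bytes in the middle of a flat file does shift everything after that point, which is one reason single-file databases such as SQLite organize the file into fixed-size pages and rewrite only the pages that changed.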

0 Answers