2

I have an issue with fragmentation on my drive. I have a program that generates over 50,000 files in different folders, and each file grows over time. Each file will end up about 500 MB in size, and I need to read the files quickly. The problem is that each file ends up spread across the drive, and defragmentation would take over 4 weeks.

I heard about a filesystem that spreads files across the drive so that the gap between each file is the same size. I searched the internet for that filesystem but I couldn't find anything.

My program is written in Java; maybe there is a way to place the beginning of a file at a specific byte position on the drive.

I would be glad if someone could help me with this issue.

Tobias Geiselmann
    There is nothing you can do about that using Java. In fact, Java is meant to keep the programmer away from such system-dependent details. – Timothy Truckle Oct 09 '17 at 11:59

4 Answers

1

I heard about a filesystem that spreads files across the drive so that the gap between each file is the same size. I searched the internet for that filesystem but I couldn't find anything.

Most likely you did not find it because it does not exist...

But there are RAID systems (Redundant Array of Inexpensive Disks), which could ease your pain...

Timothy Truckle
0

As Timothy said, you can't get to that level by using Java.
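What plain Java can do is ask for a file's final size up front, which is only a rough sketch of an idea and not a way to choose a disk position. The class name `Preallocate` and the method `preallocate` are made up for illustration. Whether the operating system reserves contiguous blocks for the file, or merely records the length (for example as a sparse file), depends entirely on the filesystem, so treat this as an assumption to measure rather than a fix.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: sets a file's length before any data is written.
// Assumption: the filesystem MAY use the length as an allocation hint;
// Java itself gives no guarantee about block placement.
final class Preallocate {

    static void preallocate(Path file, long sizeBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            // Only ever extend the file, never truncate existing data.
            if (raf.length() < sizeBytes) {
                raf.setLength(sizeBytes);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("prealloc-demo", ".bin");
        preallocate(file, 1024L * 1024L); // 1 MiB for the demo, not 500 MB
        System.out.println("length on disk: " + Files.size(file));
        Files.delete(file);
    }
}
```

Since the file then has its final length from the start, the program would have to keep track of how much of it actually contains data.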

I haven't heard of such a filesystem either, and the idea doesn't seem very logical to me.

Perhaps, if you are storing text, you could use a NoSQL database (like MongoDB), which stores data in a binary format. You would probably get good speeds, and the Java connector is easy to use.

webo80
0

Use a Linux filesystem like ext4, where disk fragmentation is typically very low, but also make sure you keep plenty of free disk space; otherwise fragmentation will happen anyway.
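To keep an eye on the free-space part from the Java side, something like the following might do. It is only a sketch: the class `FreeSpaceCheck`, its methods, and the 20% threshold in `main` are my own assumptions, not a known cutoff below which fragmentation starts.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports how much of the volume holding a
// directory is still usable, so the program can warn before it fills up.
final class FreeSpaceCheck {

    // Fraction of the volume usable by this JVM (0.0 if the size is unknown).
    static double freeFraction(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        long total = store.getTotalSpace();
        if (total <= 0) {
            return 0.0;
        }
        return (double) store.getUsableSpace() / total;
    }

    static boolean hasHeadroom(Path dir, double minFreeFraction) throws IOException {
        return freeFraction(dir) >= minFreeFraction;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        // 0.20 is an arbitrary illustrative threshold.
        System.out.printf("free fraction: %.2f, headroom ok: %b%n",
                freeFraction(dir), hasHeadroom(dir, 0.20));
    }
}
```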

0

I also don't know of a file system that does this. However, I have some information that may help:

If you used an SSD, fragmentation would be much less of a concern for read performance. SSDs store data in chunks (NAND flash pages, 16 KB for instance). These are stored in scattered order anyway due to the wear-levelling algorithm used, which is very unlike how hard disks work in practice. Pages on SSDs are also accessed in a highly parallel fashion. As a result, fragmentation would have much less impact on read performance with an SSD. It would still carry some penalty for writes and deletions.

RAID would also allow for higher read performance, as Timothy mentions.

Michael K