
In Unity, I am producing a project which procedurally builds a (particularly complicated) world, and stores all generated data in disk files for future use. I've got the file size down to 8 KB per world chunk, and might be able to make it even smaller; but there's an additional cost to opening and closing so many file streams in rapid succession.

On startup, I've got 2,970 chunks created. I've got loading time down to approximately 20 seconds on an FX-8300 CPU with a reasonably fast HDD. Making the files smaller is not likely to help me here; I seem to be running into a fixed cost on opening and closing file streams (multiplied by nearly 3,000!).

So, I'm looking for an alternative. Most of my recent programming experience is in Java, Python, JavaScript, and D, so I may be missing an elephant in the room. Long-term storage is most certainly going to have to be local. Is it possible to accelerate FileStreams by throwing them into some kind of object pool? Or can I use some kind of SQLite system? Is there something even better out there?

Unity seems to restrict me to .NET 2.0 features at the moment, but managing files in massive quantities is a fairly common task (in a broader sense), and I can't help but feel that I'm doing this in a naive way.

Thanks for any and all input!

There's a lot of code, but the relevant part is probably this. Let me know if you need to see anything else.

public bool record(BlockData data) {
    string name = BuildChunkFileName(data.Origin);
    try {
        FileStream stream = new FileStream(BuildChunkFilePath(data.Origin), FileMode.OpenOrCreate);

        // Flatten the chunk's block properties into a ushort array.
        ushort[] arrayData = new ushort[Chunk.blockSize * Chunk.blockSize * Chunk.blockSize];

        int index = 0;
        foreach(BlockProperties props in data.Properties) {
            arrayData[index] = props.isOpaque ? (ushort)1 : (ushort)0;
            index++;
        }

        // Reinterpret the ushort array as raw bytes and kick off an asynchronous write.
        byte[] byteData = new byte[(Chunk.blockSize * Chunk.blockSize * Chunk.blockSize) * sizeof(ushort)];
        Buffer.BlockCopy(arrayData, 0, byteData, 0, (Chunk.blockSize * Chunk.blockSize * Chunk.blockSize) * sizeof(ushort));
        IAsyncResult result = stream.BeginWrite(byteData, 0, byteData.Length, null, null);

        // Poll until the asynchronous write completes.
        while(!result.IsCompleted) {
            Thread.Sleep(100);
        }

        stream.Close();
    } catch(Exception e) {
        Debug.LogException (e);
        return false;
    }
    return true;
}

public bool read(BlockData data) {
    int x = 0, y = 0, z = 0;
    int i = 0;

    string name = BuildChunkFileName (data.Origin);
    string path = BuildChunkFilePath (data.Origin);

    try {
        FileStream stream = new FileStream(path, FileMode.Open);

        // Read the chunk's raw bytes back asynchronously.
        byte[] byteData = new byte[(Chunk.blockSize * Chunk.blockSize * Chunk.blockSize) * sizeof(ushort)];
        IAsyncResult result = stream.BeginRead(byteData, 0, byteData.Length, null, null);

        // Poll until the asynchronous read completes.
        while(!result.IsCompleted) {
            Thread.Sleep(100);
        }

        ushort[] arrayData = new ushort[Chunk.blockSize * Chunk.blockSize * Chunk.blockSize];
        Buffer.BlockCopy(byteData, 0, arrayData, 0, byteData.Length);

        // Convert each flat index back into 3-D block coordinates.
        for(i = 0; i < arrayData.Length; i++) {
            x = i % Chunk.blockSize;
            y = (i / (Chunk.blockSize)) % Chunk.blockSize;
            z = i / (Chunk.blockSize * Chunk.blockSize);
            data.Properties [x, y, z].isOpaque = arrayData [i] == 0 ? false : true;
        }

        stream.Close();
    } catch(Exception) {
        // a lot of specific exception handling here, the important part
        // is that I return false so I know there was a problem.
        return false;
    }

    return true;
}
  • If you need fast access to a file, try a memory-mapped file; if you need queryable access to relational data, a database might be the best way. It all depends on the hows and whys. – TheGeneral Feb 06 '18 at 01:10
  • I should note that I do intend to ultimately open the stream with a using statement; but one thing at a time. – Michael Macha Feb 06 '18 at 01:17
  • The fastest way to read a binary file is to read the whole file at once with File.ReadAllBytes, as described here: https://stackoverflow.com/a/10239650/7821979 (and yes, it has its write counterpart). – 5ar Feb 06 '18 at 01:30
  • Also, `Thread.Sleep(100)` is a terribly inefficient way to wait for a result; check if you are able to use the async-await C# syntax for async operations. – 5ar Feb 06 '18 at 01:34
  • Furthermore, if you have a large number of files, it might be beneficial to look into a document-oriented database (MongoDB, for instance, has GridFS, which is used for binary data chunks) that will optimize the reads and writes for you, so you only need to handle the communication with the database. – 5ar Feb 06 '18 at 01:38
  • I'll group my comments together as a single answer to maintain readability in case there is a discussion. – 5ar Feb 06 '18 at 01:55

1 Answer


The fastest way to read a binary file is to read the whole file at once with File.ReadAllBytes, as described here: https://stackoverflow.com/a/10239650/7821979. The method has its write counterpart, File.WriteAllBytes, as well.
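For reference, here is a minimal sketch of what record could look like rewritten around File.WriteAllBytes. It reuses the question's own identifiers (BuildChunkFilePath, Chunk.blockSize, BlockData, BlockProperties); only the I/O changes.

public bool record(BlockData data) {
    int count = Chunk.blockSize * Chunk.blockSize * Chunk.blockSize;
    ushort[] arrayData = new ushort[count];

    // Flatten the block properties exactly as in the question's code.
    int index = 0;
    foreach (BlockProperties props in data.Properties) {
        arrayData[index++] = props.isOpaque ? (ushort)1 : (ushort)0;
    }

    byte[] byteData = new byte[count * sizeof(ushort)];
    Buffer.BlockCopy(arrayData, 0, byteData, 0, byteData.Length);

    try {
        // One call opens, writes, and closes the file; no stream bookkeeping.
        File.WriteAllBytes(BuildChunkFilePath(data.Origin), byteData);
    } catch (Exception e) {
        Debug.LogException(e);
        return false;
    }
    return true;
}

read is symmetric: File.ReadAllBytes(path) returns the whole file as a byte[] in one call, followed by the same Buffer.BlockCopy unpacking.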

Using Thread.Sleep(100) is a terribly inefficient way to wait for a result. I'm not familiar with Unity, but check if you are able to use the async-await C# syntax (along with the Task object) for asynchronous operations.
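As a sketch only, assuming a Unity version that offers the .NET 4.x scripting runtime (not the .NET 2.0 profile mentioned in the question), an awaited write could look like this; RecordAsync is a hypothetical name:

using System;
using System.IO;
using System.Threading.Tasks;

public static class ChunkIO {
    // Hypothetical async counterpart to record(); requires the .NET 4.x runtime.
    public static async Task<bool> RecordAsync(string path, byte[] byteData) {
        try {
            using (FileStream stream = new FileStream(
                       path, FileMode.OpenOrCreate, FileAccess.Write,
                       FileShare.None, 4096, useAsync: true)) {
                // The await frees the thread while the OS performs the write,
                // instead of burning it in a Thread.Sleep polling loop.
                await stream.WriteAsync(byteData, 0, byteData.Length);
            }
            return true;
        } catch (Exception) {
            return false; // log as appropriate
        }
    }
}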


Furthermore, if you have a large number of files, it might be beneficial to look into a document-oriented database that will optimize the reads and writes for you, so you only need to handle the communication with the database. MongoDB, for instance, has GridFS, which is used for binary data chunks, but there might be document databases that are even better suited to your use case.
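Purely as an illustration of that route (again, not something the .NET 2.0 profile supports out of the box), the official MongoDB C# driver exposes a GridFSBucket; the connection string, database name, and chunk naming below are placeholders:

using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

public static class ChunkStore {
    // Placeholder connection string and database name.
    static readonly IGridFSBucket bucket = new GridFSBucket(
        new MongoClient("mongodb://localhost:27017").GetDatabase("world"));

    // Store one chunk's raw bytes under a name derived from its origin.
    public static ObjectId Save(string chunkName, byte[] byteData) {
        return bucket.UploadFromBytes(chunkName, byteData);
    }

    // Load the most recent revision of a chunk by name.
    public static byte[] Load(string chunkName) {
        return bucket.DownloadAsBytesByName(chunkName);
    }
}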

Considering that you don't have any relations, there is no point in using a SQL database for your problem. However, using something like SQLite still might be better than using multiple files.
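A rough sketch of the SQLite route, assuming the Mono.Data.Sqlite assembly is available to the project (availability varies by Unity setup); the table layout and key format here are invented for illustration:

using Mono.Data.Sqlite;

public static class ChunkDb {
    // One database file replaces thousands of per-chunk files; chunks are
    // keyed by an illustrative "key" string (e.g. derived from the origin).
    public static void Save(string dbPath, string key, byte[] byteData) {
        using (var conn = new SqliteConnection("URI=file:" + dbPath)) {
            conn.Open();
            using (var create = new SqliteCommand(
                "CREATE TABLE IF NOT EXISTS chunks (key TEXT PRIMARY KEY, data BLOB)", conn)) {
                create.ExecuteNonQuery();
            }
            using (var insert = new SqliteCommand(
                "INSERT OR REPLACE INTO chunks (key, data) VALUES (@key, @data)", conn)) {
                insert.Parameters.AddWithValue("@key", key);
                insert.Parameters.AddWithValue("@data", byteData);
                insert.ExecuteNonQuery();
            }
        }
    }
}

Batching many chunk writes inside a single transaction would also amortize the per-write overhead, which is the same fixed-cost problem the question describes for file streams.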

  • Thank you, I'll look into that! This is early code; I don't like the consistent 100 ms wait in Thread.Sleep. I'll look into an asynchronous wait as well. – Michael Macha Feb 06 '18 at 03:08
  • It's everything I was looking for. It got rid of that new statement, and I'm pretty sure that was a large part of the slowdown. Incidentally, your method also removed the need for a Thread.Sleep. I should point out that, outside the scope of this method, I do actually have relations in my code, chiefly between investigated coordinates and chunk data through a purely deterministic Perlin noise function. I still feel like there might be some benefit to SQLite, if only to be managing a single file. – Michael Macha Feb 06 '18 at 03:30