
Background

First, I have a text file (CSV) with a few columns and hundreds of thousands of rows. The total file size is 10 MB. I added it as a Unity resource, so it loads as a TextAsset.

The related code is:

TextAsset txtData = Resources.Load("data.csv") as TextAsset;
string txt = txtData.text;
strReader = new StringReader(txt);
string line0 = strReader.ReadLine();

....

currentLine = strReader.ReadLine();
while (currentLine != null) { // ReadLine() returns null after the last line, so we quit
    var values = currentLine.Split(',');
    try {
        // Convert.ToSingle already returns float, so no cast is needed
        float x = Convert.ToSingle(values[colX]);
        float y = Convert.ToSingle(values[colY]);
        float z = Convert.ToSingle(values[colZ]);
        float w = Convert.ToSingle(values[colSize]);
        runningList.Add(v1); // v1 is not defined in this excerpt
    } catch (Exception e) {
    }
    currentLine = strReader.ReadLine();
}

Problem

Reading plus parsing turned out to be slow enough to affect Unity's rendering, so I logged timings, measuring the time taken for every 500 rows. Strangely, the last group of 500 rows takes 12 ms, the second to last takes 20 ms, and the time increases linearly up to 1.5-1.7 seconds for the first group.
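A minimal sketch of this kind of per-group timing, assuming plain .NET `Stopwatch` and any `TextReader` as input. The class and method names (`GroupTimer`, `TimeGroups`) are hypothetical and not taken from the original project.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

public static class GroupTimer
{
    // Reads every line from `reader` and returns the elapsed milliseconds
    // for each group of `groupSize` lines (the last group may be smaller).
    public static List<double> TimeGroups(TextReader reader, int groupSize)
    {
        var timings = new List<double>();
        var watch = Stopwatch.StartNew();
        int rowsInGroup = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line.Split(',');            // same kind of work as the loop being measured
            rowsInGroup++;
            if (rowsInGroup == groupSize)
            {
                timings.Add(watch.Elapsed.TotalMilliseconds);
                watch.Reset();          // Reset + Start also works on older runtimes
                watch.Start();
                rowsInGroup = 0;
            }
        }
        if (rowsInGroup > 0)
        {
            timings.Add(watch.Elapsed.TotalMilliseconds);
        }
        return timings;
    }
}
```

Calling `GroupTimer.TimeGroups(new StringReader(txt), 500)` would give one entry per 500 rows, which makes a trend across groups easy to log or plot.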

More Info

While Unity is drawing at 90 Hz, I use a separate thread to read the string and parse the data.

Question

Where should I look for the problem? I use a Unity resource, a StringReader, Split, and parsing to float. Which of these is the cause, and is there a way to improve it?

It seems strange that the time per group decreases as reading progresses.

Update

After I switched to a file stream reader, each group takes 2 ms. So is the Unity TextAsset the cause?
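For reference, reading the same data through a file stream might look like the sketch below. It assumes the CSV is available as a regular file on disk; the `path` argument and the `CsvFileReader` name are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;

public static class CsvFileReader
{
    // Streams lines lazily from a file on disk instead of holding
    // the whole text in one string. `path` is a hypothetical location.
    public static IEnumerable<string> ReadLines(string path)
    {
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
```

Each line can then be split and parsed exactly as in the loop above, so the two approaches can be timed against each other on the same data.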

Splash
  • Your code can be tidied up by changing it to `while( ( currentLine = strReader.ReadLine() ) != null )` and avoiding the rest of the last-line checks. – Dai Sep 29 '16 at 02:44
  • Also, what does a typical line in your CSV file look like? If it contains a lot more text than the four columns you're extracting then you're probably causing lots of unnecessary string allocations (`String.Split` is not cheap). Consider using a finite-state-machine parser and only allocating new strings for relevant columns. – Dai Sep 29 '16 at 02:45
  • Finally, never do `catch(Exception)` - and *especially don't swallow them*. Your code might be slow on failing reads because exceptions are expensive. Use `Single.TryParse` instead of `Convert.ToSingle`, which will not throw an exception when it cannot parse a string value. – Dai Sep 29 '16 at 02:46
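The suggestions in these comments could be combined roughly as follows. This is only a sketch: `RowParser`, `GetColumn`, `TryParseColumn`, and `CountValidRows` are hypothetical names, quoted or escaped CSV fields are not handled, and invariant-culture number formatting is an assumption about the data.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class RowParser
{
    // Returns the text of column `index` (0-based) without splitting the
    // whole line, or null if the line has fewer columns.
    public static string GetColumn(string line, int index)
    {
        int start = 0;
        for (int i = 0; i < index; i++)
        {
            int comma = line.IndexOf(',', start);
            if (comma < 0) return null;
            start = comma + 1;
        }
        int end = line.IndexOf(',', start);
        if (end < 0) end = line.Length;
        return line.Substring(start, end - start);
    }

    // Parses one column as a float without throwing on bad input.
    public static bool TryParseColumn(string line, int index, out float value)
    {
        value = 0f;
        string field = GetColumn(line, index);
        return field != null &&
               float.TryParse(field, NumberStyles.Float,
                              CultureInfo.InvariantCulture, out value);
    }

    // Loop shape from the first comment: ReadLine() returns null at the end.
    public static int CountValidRows(TextReader reader, int colX, int colY)
    {
        int valid = 0;
        string currentLine;
        while ((currentLine = reader.ReadLine()) != null)
        {
            float x, y;
            if (TryParseColumn(currentLine, colX, out x) &&
                TryParseColumn(currentLine, colY, out y))
            {
                valid++;    // a real loop would store x and y here
            }
        }
        return valid;
    }
}
```

Only the requested columns are turned into new strings, and malformed rows are skipped through the `bool` return value rather than through a swallowed exception.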

1 Answer


Given the behavior, this is almost certainly the C# issue with reading Linux text files.

C# expects \r\n, but Linux files contain only \n. It would then be plausible that each line read scans through the rest of the file before working out that it is a Linux file, so the time to read a line would be proportional to the remaining file size.

OLNG
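One way to check the line-ending hypothesis is to time `StringReader.ReadLine` over the same synthetic data with each line ending. The sketch below only sets up that comparison and claims no result; `LineEndingCheck` and its methods are hypothetical names, and the row content is made up.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public static class LineEndingCheck
{
    // Builds `rows` synthetic CSV rows joined by the given newline sequence.
    public static string BuildCsv(int rows, string newline)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < rows; i++)
        {
            sb.Append(i).Append(",1.0,2.0,3.0").Append(newline);
        }
        return sb.ToString();
    }

    // Counts the lines that StringReader.ReadLine yields for `text`.
    public static int CountLines(string text)
    {
        int count = 0;
        using (var reader = new StringReader(text))
        {
            while (reader.ReadLine() != null) count++;
        }
        return count;
    }

    // Returns the milliseconds taken to read every line of `text`.
    public static double TimeReadMs(string text)
    {
        var watch = Stopwatch.StartNew();
        CountLines(text);
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Comparing `TimeReadMs(BuildCsv(200000, "\n"))` with `TimeReadMs(BuildCsv(200000, "\r\n"))` on the target runtime would show whether the line ending alone changes the read time.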