I want to read lines from a CSV file, use Rx.NET to do some transformations, and then send the updates in batches every 250 milliseconds.
using System;
using System.Collections.Generic;
using System.IO;
using System.Reactive.Linq;

public static IEnumerable<string> ReadCSV(string filePath)
{
    // Stream the file lazily, one line per yield, and dispose the reader when the enumeration ends
    using (var reader = new StreamReader(File.OpenRead(filePath)))
    {
        while (!reader.EndOfStream)
        {
            yield return reader.ReadLine();
        }
    }
}
var rows = ReadCSV("filePath").ToObservable();
rows
    .Buffer(50)                                  // batch 50 lines at a time
    .Zip(Observable.Interval(
        TimeSpan.FromMilliseconds(250)),
        (res, _) => res)                         // emit one batch per 250 ms tick
    .Subscribe(lines =>
    {
        //do something
    });
The CSV file is only around 80 MB, but the console app's memory usage climbs to about 1 GB.
What's happening is that Zip waits for both sequences to produce a value. The CSV sequence produces data much faster than the interval, so Zip keeps queuing the buffered batches in memory while it waits for the other sequence.
What makes it even worse is that the memory is not released even after all the updates have been processed. If I remove the Zip, memory looks fine: it appears to be released as each batch is processed, and the whole app stays at around 20 MB the entire time.
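For reference, this is roughly the Zip-free variant I mean (a minimal sketch; the subscribe body stands in for my actual processing):

var rows = ReadCSV("filePath").ToObservable();
rows
    .Buffer(50)
    .Subscribe(lines =>
    {
        //do something
    });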
Two questions:
1. Is there a way to tell the observable to pause reading until the previous item (in my case, the buffered lines) has been processed?
2. Why is the memory not released after all the updates have been processed, and is there a way to avoid this?