I am currently evaluating Deedle for my project and have run some rudimentary performance tests. The code I used is attached below. It performs two operations:
- generate two 1000x10 data frames and multiply them with each other once
- generate two series (length 1000) and multiply them with each other 10 times
The execution time of these operations (only the calculation, not the generation) is measured with Stopwatch. I would expect the two to take roughly the same time, since both amount to about 10,000 element-wise multiplications. However, on my machine the execution time is ~10ms for the series and ~200ms for the data frames, so calculating with data frames is ~20x slower. I'm running this on .NET Core 2.1 with Deedle 2.0.0-beta01 and FSharp.Core 4.5.2. I got similar results with Deedle 1.2.5 as well.
Am I doing something wrong, is this an issue with the library or its C# interface, or could there be some other reason for this?
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Deedle;

namespace DeedleTest
{
    class Program
    {
        static void Main(string[] args)
        {
            const int countCols = 10;
            const int countRows = 1000;

            // Test 1: multiply two 1000x10 frames once.
            Frame<DateTime, int> df1 = GenerateFrame(countCols, countRows, false);
            Frame<DateTime, int> df2 = GenerateFrame(countCols, countRows, false);
            var stopwatch = Stopwatch.StartNew();
            df1 = df1 * df2;
            stopwatch.Stop();
            Console.WriteLine(stopwatch.ElapsedMilliseconds);

            // Test 2: multiply two series of length 1000, countCols (10) times.
            Series<DateTime, double> ser1 = GenerateSeries(countRows);
            Series<DateTime, double> ser2 = GenerateSeries(countRows);
            stopwatch.Restart();
            foreach (int i in Enumerable.Range(0, countCols))
            {
                ser1 = ser1 * ser2;
            }
            stopwatch.Stop();
            Console.WriteLine(stopwatch.ElapsedMilliseconds);

            Console.ReadKey();
        }

        // Builds a frame from countCols generated series; with randomNumbers == false,
        // every value in a column equals that column's index.
        private static Frame<DateTime, int> GenerateFrame(int countCols, int countRows, bool randomNumbers = true)
        {
            var seriesList = new List<Series<DateTime, double>>();
            foreach (int col in Enumerable.Range(0, countCols))
            {
                Series<DateTime, double> series = GenerateSeries(countRows, randomNumbers ? null : (double?)col);
                seriesList.Add(series);
            }
            return Frame.FromColumns(seriesList);
        }

        // Builds a series with one value per second starting at DateTime.Now;
        // values are random unless a fixed number is supplied.
        private static Series<DateTime, double> GenerateSeries(int countRows, double? number = null)
        {
            var randgen = new Random();
            var startDate = DateTime.Now;
            var builder = new SeriesBuilder<DateTime, double>();
            foreach (int row in Enumerable.Range(0, countRows))
            {
                builder.Add(startDate.AddSeconds(row), number == null ? randgen.NextDouble() : (double)number);
            }
            return builder.Series;
        }
    }
}
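
The timings above come from a single run of each operation, so they may include one-time costs such as JIT compilation on the first call. Below is a minimal sketch of a repeated measurement that could help rule this out. The names `Bench`, `TimeMedian` and `BenchDemo` are hypothetical helpers introduced here, not part of Deedle or of the program above. The sketch uses only `System.Diagnostics.Stopwatch`, and the array loop is a placeholder workload standing in for the frame or series multiplication.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class Bench
{
    // Hypothetical helper (not part of Deedle): runs `action` a few times
    // untimed as warm-up, then times `runs` further calls and returns the
    // middle element of the sorted timings in milliseconds.
    public static double TimeMedian(Action action, int warmups = 3, int runs = 11)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        // Warm-up calls are executed but not measured.
        for (int i = 0; i < warmups; i++) action();

        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(samples);
        return samples[runs / 2]; // the median when `runs` is odd
    }
}

static class BenchDemo
{
    static void Main()
    {
        // Placeholder workload: element-wise product of two plain arrays.
        double[] a = Enumerable.Range(0, 1000).Select(x => (double)x).ToArray();
        double[] b = Enumerable.Range(0, 1000).Select(x => 2.0 * x).ToArray();
        var result = new double[a.Length];

        double ms = Bench.TimeMedian(() =>
        {
            for (int i = 0; i < a.Length; i++) result[i] = a[i] * b[i];
        });

        Console.WriteLine("median ms: " + ms);
    }
}
```

In the test program, the placeholder lambda would be swapped for something like `() => { var r = df1 * df2; }` and for the series loop, so that both operations are compared after warm-up. This only separates first-call overhead from steady-state cost; it does not by itself explain any difference that remains.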