I am trying to display a large number of values in a WPF chart after the user has selected the target parameters. For this I am using LiveCharts for WPF (as in this example: https://lvcharts.net/App/examples/v1/Wpf/Scrollable), and it works well. To change the values in the chart, I call this function:

' Change values of xAxis
Private Sub ChangeXAxis(axis As Object, title As String, values As Object)
    axis.Labels = values  ' has to be array or list of values (strings for X, double for Y)
    axis.Title = title
End Sub

The selected values have to be filtered by timestamp or parameter before they are shown. For this purpose, I am using the following function:

    Public Function FilterListForChart(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer)
    ...

    Try
        If axis = "X" Then
            'Return filtered values for axis
            Dim query As IEnumerable = (From rows In values
                                        Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                        Select (Math.Round(CDbl(rows(1)), 3))).ToList()
            
            Return query

Unfortunately, LiveCharts needs the values as a list or array (of String or Double) to display the data correctly. The problem is that converting the IEnumerable to a list or array takes a long time when I want to show a lot of values (e.g. more than 300.000 values take 10 seconds or more).

Because of this, I tried a number of different things. None of the following options helped:

  • Use ToArray()
  • Use ToList()
  • Use allocated list with defined capacity
  • Use for each loop instead of queries
  • Filter without rounding (use reduced data processing)

During my tests, I got the following processing times:

  • Filter 180.000 values out of 180.000 total and show in chart: X = 1088 ms, Y = 1085 ms, total = 2173 ms
  • Filter 180.000 values out of 1.800.000 total and show in chart: X = 9919 ms, Y = 9983 ms, total = 19902 ms

Defining the query and the filter takes only a few milliseconds. Most of the computing time goes into creating the list/array. During my tests, I was able to reduce the computing time by only a few milliseconds.
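
A note on these timings: LINQ queries are executed lazily, so the filter does not run when the query is defined but when `ToList()` or `ToArray()` enumerates it. The time measured for "creating the list" therefore probably includes the whole filtering pass. The following is a minimal sketch of that behaviour, not the code from the question; `PredicateCalls`, `InRange` and `BuildQuery` are hypothetical names, and the data is a plain integer range.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

Module DeferredExecutionDemo
    ' Hypothetical counter: how often the filter predicate has been evaluated.
    Public PredicateCalls As Integer = 0

    Private Function InRange(value As Integer, lower As Integer, upper As Integer) As Boolean
        PredicateCalls += 1
        Return value > lower AndAlso value < upper
    End Function

    Public Function BuildQuery(source As IEnumerable(Of Integer),
                               lower As Integer,
                               upper As Integer) As IEnumerable(Of Integer)
        ' Only describes the query; no element of source is read here.
        Return From v In source Where InRange(v, lower, upper) Select v
    End Function

    Sub Main()
        Dim source = Enumerable.Range(0, 1800000)

        Dim sw = Stopwatch.StartNew()
        Dim query = BuildQuery(source, 100, 27600)
        Console.WriteLine($"Define query: {sw.ElapsedMilliseconds} ms, predicate calls: {PredicateCalls}")

        sw.Restart()
        Dim result = query.ToList()   ' the filter runs here, once per source element
        Console.WriteLine($"ToList: {sw.ElapsedMilliseconds} ms, predicate calls: {PredicateCalls}, items: {result.Count}")
    End Sub
End Module
```

If the predicate counter only rises during `ToList()`, the cost sits in the per-row work of the filter, not in allocating the list.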

My current workaround is to show fewer than 200.000 values at first, load the remaining data with a BackgroundWorker and update the GUI later. But that is not a good solution: the user has to see all values in the chart to assess the data, and the values have to be reloaded whenever the user adds a parameter to the chart. Filtering the data in the preceding SQL query is not a good alternative either, because it takes a similar amount of time.

UPDATE:

I ran further tests with five scenarios:

  Dim yAxis1 = TestSpeed_ForEach(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis2 = TestSpeed_ForEachFixedList(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis3 = TestSpeed_QueryToArray(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis4 = TestSpeed_QueryToList(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)
  Dim yAxis5 = TestSpeed_QueryToListParallel(values_YAxis, "Y", counterStart, counterEnd, decimalCut, False)

I tested the functions with a Stopwatch for different numbers of values, from 2.300 up to 1.800.000. As described before, I was not able to reduce the calculation time by much. The function with the fixed-capacity list was the fastest, but the saving was only between 50 and 400 ms. Here are the results for querying 27.500 values out of 1.800.000 total:

  • TIME: YAxis 1 - 10868ms
  • TIME: YAxis 2 - 10844ms
  • TIME: YAxis 3 - 11311ms
  • TIME: YAxis 4 - 11265ms
  • TIME: YAxis 5 - 11313ms

In a second scenario I tested the fixed-capacity list with 30.000 and with 500.000 entries, but this had only a very small effect.

Here are the functions used:

'################ TESTING ################
Public Function TestSpeed_ForEach(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim yList As New List(Of Double)

        For Each row In values
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_ForEachFixedList(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, xIsTimeList As Boolean)
    Try
        Const capacity As Integer = 30000
        Dim yList As New List(Of Double)(capacity)

        For Each row In values
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToList(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).ToList()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToArray(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).ToArray()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

Public Function TestSpeed_QueryToListParallel(values As IEnumerable, axis As String, counterStart As Integer, counterEnd As Integer, decimalCut As Integer, isTimeAxis As Boolean)
    Try
        Dim query As IEnumerable = (From rows In values
                                    Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                                    Select (Math.Round(CDbl(rows(4)), decimalCut))).AsParallel.ToList()
        Return query

    Catch ex As Exception
        Return Nothing
    End Try
End Function

What else could I do to speed this up?

  • Try just enumerating your filtered data with a For Each loop. I don't believe that `ToList()` is too slow. Yes, it is better to create the list with a proper capacity and then add items. Also take a look at [PLINQ](https://learn.microsoft.com/en-us/archive/msdn-magazine/2009/december/concurrent-affairs-data-parallel-patterns-and-plinq) – Svyatoslav Danyliv Feb 01 '22 at 11:44
  • Consider populating a list via a loop instead of using LINQ – Anu6is Feb 01 '22 at 13:08
  • Have you considered pre-allocating the underlying storage for the list? I would expect the allocation strategy to be geometric (doubling each time it needs to expand), which still means repeated reallocating and copying for a very large number of items. `List(Of T)` has a constructor that takes an initial capacity and `AddRange` that will add from `IEnumerable(Of T)`. You would be trading space for performance for the case where you don't know a priori exactly how many filtered results you will have. – Craig Feb 01 '22 at 13:59
  • Nobody ever needs to display 300k points on a chart. Even if you filled an 8k display, and used one pixel per point, you can still display at most 7680 points, and the other 290k+ points are not shown. Heavily decimate your data before charting – djv Feb 01 '22 at 17:54
  • Also you use `IEnumerable`, not the generic `IEnumerable(Of Double)`. So you are boxing the Double into an Object. This will waste time too. Let your declaration implicitly type like `Dim query =` and it might just speed up. – djv Feb 01 '22 at 17:56
  • @djv You're right. But, as in the example chart that was posted, you can scroll and zoom in and out of the data values. In our application we need to analyse a lot of values, so we have to load e.g. 200.000 values and then zoom in to specific points and check the values. You can see an example here: [link](https://raw.githubusercontent.com/Live-Charts/WebSiteDocs/preview/v1/start/scrollable.gif) – krambambuli Feb 01 '22 at 17:57
  • @krambambuli it looks very nice, and looks like you do need all the data present – djv Feb 01 '22 at 18:10
  • @djv yes I do, and that is the problem :-/ I tested the code with implicit typing like `Dim query =...` but it doesn't really help – krambambuli Feb 01 '22 at 18:12
  • @krambambuli The problem is deeper: you pass in a `values As IEnumerable`, so even before the query you have boxed Doubles inside Objects. Every access is an unboxing. You need to unwind to the origin of the data and instead use `IEnumerable(Of Double)` to make any difference. – djv Feb 01 '22 at 19:18
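
The point about the untyped `values As IEnumerable` can be sketched as follows. With that declaration (and Option Strict Off, which the question's code appears to rely on) each `row` is an `Object`, so `row(3)` and `row.ItemArray(4)` are resolved by late binding at run time, and `ItemArray` copies the whole row into a new array on every call. This is only a sketch under assumptions: it assumes the rows are `System.Data.DataRow` objects (the `.ItemArray` calls suggest that, but the question does not say so), and `FilterTyped`, `expectedCount` and the table layout in `Main` are made up for illustration.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module TypedFilterDemo
    ' Column positions follow the question: 3 = counter, 4 = Y value.
    Public Function FilterTyped(rows As IEnumerable(Of DataRow),
                                counterStart As Integer,
                                counterEnd As Integer,
                                decimalCut As Integer,
                                expectedCount As Integer) As List(Of Double)
        ' expectedCount is a hypothetical capacity hint, not an exact size.
        Dim result As New List(Of Double)(expectedCount)
        For Each row As DataRow In rows
            ' Early-bound indexer; the counter is read once per row.
            Dim counter = CInt(row(3))
            If counter > counterStart AndAlso counter < counterEnd Then
                ' row(4) reads one cell; row.ItemArray(4) would copy the whole row first.
                result.Add(Math.Round(CDbl(row(4)), decimalCut))
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Hypothetical five-column table, only to make the sketch runnable.
        Dim table As New DataTable()
        table.Columns.Add("Id", GetType(Integer))
        table.Columns.Add("X", GetType(Double))
        table.Columns.Add("Name", GetType(String))
        table.Columns.Add("Counter", GetType(Integer))
        table.Columns.Add("Y", GetType(Double))
        For i = 0 To 9
            table.Rows.Add(i, i * 0.5, "p", i, i * 1.23456)
        Next

        Dim yValues = FilterTyped(table.Rows.Cast(Of DataRow)(), 2, 6, 3, 16)
        Console.WriteLine($"Filtered {yValues.Count} values")
    End Sub
End Module
```

Whether this closes most of the gap would have to be measured against the real data source.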

3 Answers

I agree with the comments. This should be faster ASSUMING your function returns list of double...

    Const capacity As Integer = 1024 * 512

    Dim query As New List(Of Double)(capacity)

    For Each rows In values
        If CInt(rows(3)) > counterStart AndAlso CInt(rows(3)) < counterEnd Then
            query.Add(Math.Round(CDbl(rows(1)), 3))
        End If
    Next

    Return query
dbasnett

You'll have to edit this and fill in the correct type wherever there is a ???. The point is to test the IEnumerable being passed to the method.

Public Function TestSpeed_ForEachFixedList(values As IEnumerable(Of ???),
                                            axis As String,
                                            counterStart As Integer,
                                            counterEnd As Integer,
                                            decimalCut As Integer,
                                            xIsTimeList As Boolean) As List(Of Double)

    Dim Nvals As List(Of ???) = values.ToList
    Try
        Const capacity As Integer = 300000
        Dim yList As New List(Of Double)(capacity)

        For Each row As ??? In Nvals
            If CInt(row(3)) > counterStart And CInt(row(3)) < counterEnd Then
                yList.Add(Math.Round(CDbl(row.ItemArray(4)), decimalCut))
            End If
        Next
        Return yList

    Catch ex As Exception
        Return Nothing
    End Try
End Function
dbasnett
  • I repeated the test with typed values, i.e. _values As IEnumerable(Of Double)_. Of course it is much faster, but now the computing time is spent creating the list before calling _TestSpeed_ForEach(...)_. Code: `Dim query As IEnumerable(Of Double) = From rows In values_YAxis Where True Select CDbl(rows.ItemArray(4))` takes zero milliseconds, but creating a list from it with `Dim valList As List(Of Double) = query.ToList` takes 360 ms. So we still have 360 ms of calculation time, which is nearly the same as before (400 ms for getting 2.300 values) – krambambuli Feb 01 '22 at 17:44

After many tests I was able to find the solution with the shortest calculation time.

    Try
        If axis = "X" Then
            Dim query = (From rows In values.AsParallel
                         Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                         Order By CDbl(rows.ItemArray(3)) Ascending
                         Select (String.Format("{0:f" & decimalCut & "}", rows.ItemArray(positionToAsk)))).ToList

            Return query

        ElseIf axis = "Y" Then
            Dim query = (From rows In values.AsParallel
                         Where CInt(rows(3)) > counterStart And CInt(rows(3)) < counterEnd
                         Order By CDbl(rows.ItemArray(3)) Ascending
                         Select (Math.Round(CDbl(rows.ItemArray(4)), decimalCut))).ToList

            Return query
        End If
    Catch ex As Exception
        Return Nothing
    End Try

The important findings were:

  • Preceding LINQ queries have to use implicit typing (no `Dim query As IEnumerable...`) (thanks @djv)
  • Use .AsParallel inside the LINQ query to use multiple cores
  • If lists are used, they should be given a size beforehand (.Capacity)
  • In some cases, For Each loops are a bit faster than LINQ queries

The optimised query can retrieve 27.500 values from 1.800.000 total in 890 ms (previously it was 11241 ms).
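
For reference, the same idea can be written with a typed source. This is a minimal sketch, not the code measured above: `Sample` and `FilterParallel` are hypothetical names, and it assumes the rows have already been converted to a typed structure and are already sorted by counter, so `AsOrdered()` (which keeps the source order) stands in for the `Order By`. If the source is not sorted, the `Order By` is still needed.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ParallelFilterDemo
    ' Hypothetical typed row: only the two fields the Y-axis query needs.
    Public Structure Sample
        Public Counter As Integer
        Public Value As Double
    End Structure

    Public Function FilterParallel(samples As IEnumerable(Of Sample),
                                   counterStart As Integer,
                                   counterEnd As Integer,
                                   decimalCut As Integer) As List(Of Double)
        ' AsParallel is applied to the source, so the filter itself runs on several cores.
        ' AsOrdered preserves the order of the source in the result.
        Return samples.AsParallel().
                       AsOrdered().
                       Where(Function(s) s.Counter > counterStart AndAlso s.Counter < counterEnd).
                       Select(Function(s) Math.Round(s.Value, decimalCut)).
                       ToList()
    End Function

    Sub Main()
        Dim samples As New List(Of Sample)(1800000)
        For i = 0 To 1799999
            samples.Add(New Sample With {.Counter = i, .Value = i * 1.23456})
        Next

        Dim yValues = FilterParallel(samples, 100, 27600, 3)
        Console.WriteLine($"Filtered {yValues.Count} values")
    End Sub
End Module
```

Whether PLINQ pays off depends on how expensive the per-row work is, so it is worth timing both the sequential and the parallel variant on the real data.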