I am writing a search mechanism using Rx in C#. I have a dataset of 50k records that must be searched for a keyword as the user types. I have written the code below, but I believe it has many issues related to concurrency and parallelization. Please review the code and let me know how it can be optimized for multi-core systems.

Code:

 public class MainPageViewModel : BindableBase
    {
        private string searchString;

        public string SearchString
        {
            get { return searchString; }
            set { SetProperty(ref searchString, value); }
        }

        private string result = "no result";

        public string Result
        {
            get { return result; }
            set { SetProperty(ref result, value); }
        }

        private ObservableCollection<string> lstItems = new ObservableCollection<string>();

        public ObservableCollection<string> LstItems
        {
            get { return lstItems; }
            set { SetProperty(ref lstItems, value); }
        }

        CoreDispatcher dispatcher = CoreWindow.GetForCurrentThread().Dispatcher;

        private List<int> dataSet1 = new List<int>();

        public MainPageViewModel()
        {
            PopulateSampleDatas();

            // Get stream of input character
            var searchDataStream = this.ToObservable<string>(() => SearchString)
                                       .Throttle(TimeSpan.FromMilliseconds(400));


            // Add Data and Search Mechanism
            var resultStream = searchDataStream
                               // Move to UI thread and clear all the result list for new search keyword result
                               .ObserveOnDispatcher()
                               .Do(str => { LstItems.Clear(); LstItems.Add(SearchString); })
                               // Move to a separate thread to create a batch of smaller datasets out of the large one
                               .ObserveOn(TaskPoolScheduler.Default)
                               .Select(GetFilteredData)
                               // For every newly typed keyword, ignore the previous buffered data and switch to the new one
                               .Switch()
                               // Run the filter operation on those batches of data in parallel
                               .SelectMany(FilterData);


            // Subscribe to the search stream
            resultStream.ObserveOnDispatcher().Subscribe(v =>
            {
                foreach (var val in v)
                {
                    LstItems.Add(val.ToString());
                }
            });

        }

        /// <summary>
        /// Filters the data.
        /// </summary>
        /// <param name="arg">The argument.</param>
        /// <returns>Task&lt;List&lt;System.Int32&gt;&gt;.</returns>
        private async Task<List<int>> FilterData(IList<int> arg)
        {
            List<int> result = new List<int>();

            // Run the filtering mechanism on one batch of the dataset
            result = Filtereddata(arg);

            return result;
        }

        private IObservable<IList<int>> GetFilteredData(string arg)
        {
            // Create smaller sets of data out of the large set to run the filter mechanism in parallel
            return dataSet1.ToObservable().Buffer(100)
                .ObserveOn(TaskPoolScheduler.Default);
        }

        /// <summary>
        /// Populates the sample data.
        /// </summary>
        private void PopulateSampleDatas()
        {
            // Populate the sample data set
            for (int i = 0; i < 50000; i++)
            {
                dataSet1.Add(i);
            }
        }
    }
ydoow
Balraj Singh
  • 50K is a tiny number of records, and Rx is about processing events, not rows. Why don't you use PLINQ or a database? Rx can tell you when the user stops hitting keys, but it *doesn't* have to be used for searching as well – Panagiotis Kanavos May 24 '16 at 09:23
  • Actually, I am creating a mobile application in which this will work as an offline search: the entire dataset is held in memory, and the search needs to run in real time as the user types each character. Hence I prefer Rx. Regarding your comment, PLINQ can also be used. Can you suggest the best way to combine the two? – Balraj Singh May 24 '16 at 09:40
  • Seems overly complex. However, to help, I would suggest only adding concurrency in one place, next to the final `Subscribe(` method, i.e. `SubscribeOn(..backgroundthread..).ObserveOn(..dispatcher..).Subscribe(..)` – Lee Campbell May 24 '16 at 09:40
  • @LeeCampbell Actually, I am trying to clear the list that holds the previous search results and start a new search whenever a new character is typed. I am also trying to search by dividing the large dataset into smaller chunks of data. Can you suggest a better way to achieve this? – Balraj Singh May 24 '16 at 09:45
  • Two separate problems. Rx seems like it could be useful for clearing the list and starting a new search. Dividing the large dataset sounds like a job for TPL/PLINQ or smarter algorithms. On a recent project we were able to get live filtering on a Combobox of 30,000 rows in under 70ms, so we didn't even leave the UI thread. No Rx, no PLINQ, just a pre-populated Trie. – Lee Campbell May 24 '16 at 14:00
  • Have a look at the original Hands On Lab (HOL) that the Rx team published. I think it has a search example if you do find that your query takes longer than 50ms (even for only 50,000 rows) - https://social.msdn.microsoft.com/Forums/en-US/06d7636d-fca6-4fbd-bdcd-7c01b2283b91/rx-handson-labs-published?forum=rx – Lee Campbell May 24 '16 at 14:03
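
The split suggested in the comments above (Rx decides when to search, PLINQ does the searching) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ParallelSearch` and `Find` are hypothetical names, and "match" is assumed to mean a substring match on the number's text, because the question does not show its `Filtereddata` method. The trailing comment shows one way the question's `searchDataStream` might call it, assuming the System.Reactive package.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Threading;

// Hypothetical helper; it is not part of the question's code.
public static class ParallelSearch
{
    // Returns every number whose decimal text contains the keyword.
    // Assumption: a substring match stands in for the real filter.
    public static List<int> Find(
        IEnumerable<int> data,
        string keyword,
        CancellationToken token = default(CancellationToken))
    {
        if (string.IsNullOrEmpty(keyword))
            return new List<int>();

        return data.AsParallel()
                   .AsOrdered()             // keep results in source order
                   .WithCancellation(token) // throws OperationCanceledException once the token is cancelled
                   .Where(n => n.ToString(CultureInfo.InvariantCulture).Contains(keyword))
                   .ToList();
    }
}

// Possible wiring from the question's view model (assumes System.Reactive
// and System.Threading.Tasks; shown as a comment because it needs a UI dispatcher):
//
// searchDataStream
//     .Select(term => Observable.FromAsync(
//         ct => Task.Run(() => ParallelSearch.Find(dataSet1, term, ct), ct)))
//     .Switch()               // drops the in-flight search when a newer keyword arrives
//     .ObserveOnDispatcher()
//     .Subscribe(results => { /* replace the contents of LstItems */ });
```

With this shape PLINQ partitions the data itself, so the manual `Buffer(100)` chunking is not needed, and concurrency is introduced in a single place. Whether this is actually faster than a plain sequential loop over 50,000 items would need to be measured.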

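The pre-populated trie mentioned in the comments is not shown there, so the following is only a minimal sketch of the idea, not that project's implementation. `PrefixTrie`, `Add` and `StartingWith` are hypothetical names, and the sketch assumes prefix matching (rows whose text starts with the typed characters) rather than substring matching.

```csharp
using System.Collections.Generic;

// Hypothetical prefix index; all names here are illustrative.
public sealed class PrefixTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        // Indexes of all rows whose text starts with the prefix this node represents.
        public readonly List<int> Rows = new List<int>();
    }

    private readonly Node root = new Node();

    // Registers a row under every prefix of its text. Done once, up front.
    public void Add(string text, int rowIndex)
    {
        var node = root;
        foreach (char c in text)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
            node.Rows.Add(rowIndex);
        }
    }

    // Walks one node per typed character; the cost depends on the prefix
    // length, not on the number of rows. An empty prefix returns no rows.
    public IReadOnlyList<int> StartingWith(string prefix)
    {
        var node = root;
        foreach (char c in prefix)
        {
            if (!node.Children.TryGetValue(c, out node))
                return new int[0];
        }
        return node.Rows;
    }
}
```

The index would be built once when the data is loaded, for example by calling `Add(i.ToString(), i)` for each item in `dataSet1`; each keystroke then only needs a `StartingWith` call. The trade-off is extra memory, since every row index is stored once per character of its text.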
0 Answers