A complex scoring process leaves me with a TDictionary structure:

target_results : TDictionary<longint, double>;

The key represents an id from a record in a MySQL table. From that id I can retrieve a filedate and a filename. I need to deliver these results ordered by one of these options:

1. dictionary value (solved: I'm doing this by assigning the dictionary to an array, sorting it and then retrieving the filename and date for each result, from the database)
2. filename
3. filedate

I'm thinking about using a TVirtualTable (from Devart) since I'm already using UniDAC in this project. Can someone suggest a faster, more native approach that also allows flexible sorting?

Miguel E
  • If your dictionary value contained filedate and filename as well as score, it would be trivially solved. – David Heffernan Feb 13 '14 at 12:40
  • Since the final value in the dictionary is the result of many sub-totals, the filename and filedate would have to be filled in during a post-scoring iteration. Wouldn't this extra step cost more than retrieving directly into a virtual table? – Miguel E Feb 13 '14 at 12:55
  • I don't see how the calculation cost has bearing on the associated items. – David Heffernan Feb 13 '14 at 12:57
  • After scoring ends, I'll have to: 1. iterate the dictionary, 2. for each entry pick and register filename and filedate, 3. assign the dictionary to an array, 4. then sort the array. Using a virtual table I'll only execute steps 1, 2 and 4. – Miguel E Feb 13 '14 at 13:01
  • We don't really have enough information. Why even bother with a dictionary? Don't you just have an array of id, filename, filedate, score? – David Heffernan Feb 13 '14 at 13:12
  • Dictionary is the best structure to handle this scoring process. To add a new sub-total into an entry I'd have to perform a search into an array. Using a hash table like TDictionary is much quicker. I've benchmarked it. – Miguel E Feb 13 '14 at 13:21
  • That's because you get bits of information in piecemeal fashion identified by the id, I presume. Unless you actually give us some concrete information, how can we advise you about performance? – David Heffernan Feb 13 '14 at 13:33
  • My question was about ways to sort a dictionary, not challenging the use of it. – Miguel E Feb 13 '14 at 13:53
  • Well, how can you sort a dictionary? They are unordered. I repeat what I said, you are asking how to get the best performance, but declining to provide the information that would make it possible to give the advice. – David Heffernan Feb 13 '14 at 13:55
  • The answer you accepted says the same as my comment that you rejected. I'm confused. – David Heffernan Feb 13 '14 at 17:13
  • I believe comments can't be accepted or rejected. @Runner has suggested using a TList after the TDictionary and then sort it. I don't think you got that far. – Miguel E Feb 13 '14 at 17:32
  • That's exactly what I meant. You said it would be too slow. – David Heffernan Feb 13 '14 at 17:40

1 Answer

You cannot sort a dictionary. The only comparable structure with sorting built in is the Judy array. However, you can sort the items that the dictionary points to. You already seem to have sorted the keys, if I understand you correctly. Now do the same for the other data if you want to sort by something different. The algorithm would be:

  1. Define a class or a record containing all data that is relevant for you
  2. Iterate or enumerate the TDictionary items into a generic TList and for each item fill the class or record with the data from the database
  3. Sort the items in the TList by the appropriate criteria. You can see an example of such sorting here: http://delphi.about.com/od/delphitips2009/qt/sort-generic.htm
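
The three steps above can be sketched as follows. This is only a minimal illustration, assuming Delphi XE2 or later (unit scope names). The record `TScoredFile`, the enum `TSortKey`, and the helpers `BuildList` and `MakeComparer` are hypothetical names, and `LookupFile` is a stand-in for the real database query, which is not shown in the question.

```delphi
program SortScoredFiles;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.DateUtils,
  System.Generics.Collections,
  System.Generics.Defaults;

type
  // Step 1: hypothetical record holding everything needed for sorting.
  TScoredFile = record
    Id: LongInt;
    Score: Double;
    FileName: string;
    FileDate: TDateTime;
  end;

  TSortKey = (skScore, skFileName, skFileDate);

// Stand-in for the real database lookup (an assumption of this sketch).
procedure LookupFile(Id: LongInt; out FileName: string; out FileDate: TDateTime);
begin
  FileName := Format('file%d.txt', [100 - Id]);
  FileDate := EncodeDate(2014, 1, 1) + Id;
end;

// Step 2: copy every dictionary pair into a list and fill in the extra fields.
function BuildList(Scores: TDictionary<LongInt, Double>): TList<TScoredFile>;
var
  Pair: TPair<LongInt, Double>;
  Item: TScoredFile;
begin
  Result := TList<TScoredFile>.Create;
  for Pair in Scores do
  begin
    Item.Id := Pair.Key;
    Item.Score := Pair.Value;
    LookupFile(Item.Id, Item.FileName, Item.FileDate);
    Result.Add(Item);
  end;
end;

// Step 3: one comparer per sort criterion.
function MakeComparer(Key: TSortKey): IComparer<TScoredFile>;
begin
  case Key of
    skFileName:
      Result := TComparer<TScoredFile>.Construct(
        function(const A, B: TScoredFile): Integer
        begin
          Result := CompareText(A.FileName, B.FileName);
        end);
    skFileDate:
      Result := TComparer<TScoredFile>.Construct(
        function(const A, B: TScoredFile): Integer
        begin
          Result := CompareDateTime(A.FileDate, B.FileDate);
        end);
  else
    // skScore: highest score first.
    Result := TComparer<TScoredFile>.Construct(
      function(const A, B: TScoredFile): Integer
      begin
        if A.Score > B.Score then Result := -1
        else if A.Score < B.Score then Result := 1
        else Result := 0;
      end);
  end;
end;

var
  Scores: TDictionary<LongInt, Double>;
  Items: TList<TScoredFile>;
  Item: TScoredFile;
begin
  Scores := TDictionary<LongInt, Double>.Create;
  try
    Scores.Add(1, 0.5);
    Scores.Add(2, 0.9);
    Scores.Add(3, 0.1);
    Items := BuildList(Scores);
    try
      Items.Sort(MakeComparer(skFileName));
      for Item in Items do
        Writeln(Item.FileName, ' ', Item.Score:0:2);
    finally
      Items.Free;
    end;
  finally
    Scores.Free;
  end;
end.
```

Once the list is built, switching the sort order is only another `Items.Sort(MakeComparer(...))` call; the database is not queried again.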

Bear in mind that the iteration will be O(N), where N covers not only the number of items but also the number of buckets in the hash table. On top of that comes the overhead of getting the data from the database for each item, and finally O(N log N) for the quicksort.

TDictionary, like all hash tables, is meant for lookup; it is good at that and bad at other tasks such as iteration or sorting. If you want to speed that up, use a separate list sorted by the appropriate key, so you can just iterate the already sorted list and get the data from the DB. If sorting is really important and done many times, use binary trees instead of hash tables. You can have one binary tree for each search field. By binary trees I mean balanced binary trees such as the AVL tree.

Binary trees suit this because they stay sorted on insertion. I can't help you more without further data.
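
As a rough illustration of the "separate sorted list" idea, the sketch below keeps a `TList` ordered by file name on every insertion. Note that this is a sorted list, not a balanced tree: finding the position is O(log N), but the insert itself shifts elements and is O(N), which a tree such as AVL would avoid. The names `TNameEntry`, `InsertPosition` and `AddSorted` are hypothetical, and the sample data is made up.

```delphi
program SortedIndexSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Generics.Collections;

type
  // Hypothetical index entry: the sort key plus the dictionary key it refers to.
  TNameEntry = record
    FileName: string;
    Id: LongInt;
  end;

// Binary search for the position that keeps List ordered by FileName.
function InsertPosition(List: TList<TNameEntry>; const FileName: string): Integer;
var
  Lo, Hi, Mid: Integer;
begin
  Lo := 0;
  Hi := List.Count;
  while Lo < Hi do
  begin
    Mid := (Lo + Hi) div 2;
    if CompareText(List[Mid].FileName, FileName) <= 0 then
      Lo := Mid + 1
    else
      Hi := Mid;
  end;
  Result := Lo;
end;

// Insert so the list is always sorted; no separate Sort call is needed later.
procedure AddSorted(List: TList<TNameEntry>; const FileName: string; Id: LongInt);
var
  Entry: TNameEntry;
begin
  Entry.FileName := FileName;
  Entry.Id := Id;
  List.Insert(InsertPosition(List, FileName), Entry);
end;

var
  ByName: TList<TNameEntry>;
  Entry: TNameEntry;
begin
  ByName := TList<TNameEntry>.Create;
  try
    AddSorted(ByName, 'report.txt', 7);
    AddSorted(ByName, 'alpha.txt', 12);
    AddSorted(ByName, 'notes.txt', 3);
    // Iteration now yields entries already ordered by file name.
    for Entry in ByName do
      Writeln(Entry.FileName, ' -> id ', Entry.Id);
  finally
    ByName.Free;
  end;
end.
```

The stored `Id` can then be used as the key into the scoring dictionary, so the hash table keeps doing the lookups while the list provides the order.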

David Heffernan
Runner
  • Thanks. It's clear to me that a dictionary can't be sorted, but the benefits of using it in the scoring phase are high. What I was looking for were alternatives to make that sort after the dictionary was final, using another structure. It's already done to sort the keys by score, but the need to sort by other connected values raised the question. – Miguel E Feb 13 '14 at 14:27
  • As I told you, you can get the other values in the way I described. You have to get the values from the database. If you have all the values in RAM then you do not need the database :). Also note that binary trees are not bad at searching at all. A balanced tree gives O(log N) per lookup, and that may be acceptable. The O(1) of hash tables beats that if you go for pure speed. – Runner Feb 13 '14 at 14:54