6

I have a problem retrieving data with Delphi's TClientDataSet.

Code with ADO:

ADOQuery1.SQL.Text := 'SELECT * FROM Table1 WITH (NoLock)';
DataSource1.DataSet := ADOQuery1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;

The code above returns over 180k rows in 3-6 seconds when using pure ADO.

Same code with TClientDataSet:

ADOQuery1.SQL.Text := 'SELECT * FROM Table1 WITH (NoLock)';
CDS1.SetProvider(ADOQuery1);
DataSource1.DataSet := CDS1;
DataSource1.DataSet.Open;
DataSource1.DataSet.Last;

The code above returns the same number of rows (over 180k), but takes 3-4 minutes.

What's wrong with CDS? It's about 100 times slower than pure ADO. Is it possible to fix it?

mjn
Valeriy
    I don't know the answer, but returning 180k(!) records to the client is a very bad design in any case... – kobik Jan 15 '15 at 09:40
  • What Delphi version? There was a huge slowdown from XE2 to XE3 (so much that we reversed our Delphi update); I don't know if it has been fixed since. – Jan Doggen Jan 15 '15 at 09:40
  • @kobik The name of the component (TClientDataSet) does not imply that the client is a human being. The application where this component is used could be any kind of program, and some programs must be able to process a huge dataset in memory. Unless the documentation explicitly says "do *not* use this component for more than X records". – mjn Jan 15 '15 at 10:04
  • I wouldn't be 100% sure that ADO is returning ***all*** the rows. Depending on settings such as `CursorLocation`, you may be getting only the first few and last few rows. There are also other settings that can give ADO an opportunity to implement certain efficiency improvements. E.g. if configured to read "forwards-only", ADO would not need to store or process any intermediate rows when you execute `DataSet.Last`. – Disillusioned Jan 15 '15 at 10:17
  • That said: pulling all data into a ClientDataSet requires each row to be converted into an internal CDS format. This format provides "enrichment features" such as bi-directional navigation, change caching, and local indexing. All this requires extra processing and memory allocation for the supporting internal structures. _(If you need these features, then you need the extra processing. If you don't need them, then `TClientDataSet` is probably not the best tool for your requirements.)_ – Disillusioned Jan 15 '15 at 10:22
  • Just trying to exclude the obvious, but you don't have any data-aware component attached to the DataSource, do you? Have you tried Disable-/EnableControls? – Lieven Keersmaekers Jan 15 '15 at 10:22
  • Disabling controls has no effect. Microsoft Management Studio executes the same command within 3-4 seconds. – Valeriy Jan 15 '15 at 10:37
  • Disabling controls is a must. And I agree with most comments here. For fetching 180k rows to the client side you need to have serious reasons. And if you are sure you need to fetch so many tuples, then choose a different storage. `TDataSet` descendants *are slow* in general. For fetching, optimize the query to use a forward, read-only cursor and access the underlying `Recordset` interface. @mjn, it's not documented, just as it's not documented how many items it is reasonable to use e.g. with `TList`. – TLama Jan 15 '15 at 12:17
  • The query period could be the last 10 years, so the number of rows returned could be even bigger. Anyway, the main question is how to avoid this huge waste of time. – Valeriy Jan 15 '15 at 12:53
  • What are you supposed to do with the result set of 180k records? Do you need to display them? Import them? Analyze them? Maybe all you need is a correct SQL statement/SP to get what you want. It only makes sense to fetch all records if you need to import the whole result set, and in that case I would not use CDS at all. @mjn, I was referring to a "client"-side application. – kobik Jan 15 '15 at 13:04
  • The first version of the code will simply navigate the underlying ADO recordset (i.e. it executes inside ADO), whereas the second version is transferring the whole recordset, so the overhead of the ADO TDataset wrapper is being applied as well as the overhead of the TClientDataset. It may not be relevant, but the Delphi ADO wrapper gets a major performance improvement from calling DisableControls, even for TDatasets with no controls linked to them. This may make a difference, but it will still be slow. – Kanitatlan Jan 15 '15 at 13:47
  • I had a similar problem, where I discovered that a round trip was being made to the database server for each record for blob data. Now FetchOnDemand on the CDS is true and poFetchBlobsOnDemand on the provider is also true, and controls are disabled at open time, but I don't remember further details. – Sertac Akyuz Jan 15 '15 at 18:18
  • @TLama: I'd rather say that while some specific dataset descendants are slow (due to poor internal design/implementation), others (such as kbmMT or my own CDS implementation) are quite fast and have no problem loading/handling hundreds of thousands or even millions of records. For example, users of my apps sometimes load up to 500k records (100+ fields, including many string/memo ones) into a CDS and are mostly restricted only by available address space (32-bit mode). – Vladimir Ulchenko Jan 16 '15 at 06:40
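The forward-only, read-only fetch suggested in the comments might look like the sketch below. It reuses the question's component names (ADOQuery1); the property values are the standard ADO options for a streaming, non-cached cursor, but treat this as an untested sketch, not a drop-in fix.

```delphi
// Sketch: forward-only, read-only ADO fetch with controls disabled,
// as suggested in the comments. Component names come from the question.
ADOQuery1.CursorLocation := clUseServer;       // stream rows from the server
ADOQuery1.CursorType     := ctOpenForwardOnly; // no backward navigation needed
ADOQuery1.LockType       := ltReadOnly;        // no editing, cheapest cursor
ADOQuery1.SQL.Text := 'SELECT * FROM Table1 WITH (NoLock)';

ADOQuery1.DisableControls;  // suppress per-row notifications to the UI
try
  ADOQuery1.Open;
  while not ADOQuery1.Eof do
  begin
    // process the current row here
    ADOQuery1.Next;
  end;
finally
  ADOQuery1.EnableControls;
  ADOQuery1.Close;
end;
```

Note that with a forward-only server-side cursor, `Last` and backward navigation are no longer available; the loop above is the only way to visit every row.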

4 Answers

1

Code above returns over 180k rows in 3-6 seconds when using pure ADO.

For some reason, I wouldn't expect the code you posted to return every single one of the 180k records... I'd expect the first "X" records to be loaded when TADOQuery.Open is called, and then the last "X" records when TADOQuery.Last is called. Looping with `while not EoF do` instead of calling `.Last` would probably be a better performance test, since (I assume) you actually want to browse all the records.

When you call TClientDataset.Last while it is linked to a DataProvider, it most likely does the equivalent of a `while not EoF do` loop over your query, which transfers all 180k records. Also, TClientDataset Insert/Append operations tend to get slower and slower the more records it holds. My best guess is that it has to reallocate its memory buffer every now and then. If that's the case, I haven't found a way to tell the TClientDataset: "Hey! Brace yourself, 180k records incoming!" (akin to TList.SetCapacity).
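A fairer benchmark along those lines could walk every record in both setups, so neither side benefits from a shortcut implementation of `.Last`. A minimal sketch (the helper name is mine; `uses DB` is assumed):

```delphi
// Sketch: benchmark a dataset by visiting every row, instead of
// relying on .Last, whose cost differs between ADO and CDS.
procedure BenchmarkFetch(DS: TDataSet);
var
  RowCount: Integer;
begin
  DS.DisableControls;   // keep UI notifications out of the timing
  try
    DS.Open;
    RowCount := 0;
    while not DS.Eof do
    begin
      Inc(RowCount);    // simulate "browsing" each record
      DS.Next;
    end;
  finally
    DS.EnableControls;
  end;
end;
```

Call it as `BenchmarkFetch(ADOQuery1)` and `BenchmarkFetch(CDS1)` and compare wall-clock times; both now do comparable work.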

If you have an older version of Delphi, one thing that could help is the Midas Speed Fix.

Ken Bourassa
  • Indeed, the standard TDataPacketWriter.WriteDataSet not only loops over the source dataset until either EOF is hit or the requested number of records has been fetched, but also writes every field value (into the resulting stream) inside an inner fields loop wrapped in an empty try/except block. – Vladimir Ulchenko Jan 16 '15 at 06:33
0

This is pretty old, but there are a lot of new Delphi programmers out there now, so here is a bit of a scoop.

When using a CDS in Delphi you are in fact creating an in-memory table. Your query probably went to swap hell.

To get the most out of a CDS, use DBX components to grab the data. They are so-called "fast forward" cursors that don't create a temp table with a cursor in the database. Forward-only doesn't do the fancy stuff that ADO does. If you need massive datasets with full update notifications and total control, use ADO. If you need to plow through a ton of data in a hurry with little load on the server, that is where CDS/DBX shines.

The DBX way is harder. It is just a dragster: First and Next are the only things that work. No updates, no commits, just a fast one-way relationship. Hook up a DBX/provider/CDS combo and you have it all: speed and editability. Use a version number to detect another user doing something to the data while you were editing. Study the provider options to see how to get power with flexibility. This is pretty much as hard-core as it gets in Delphi.
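The DBX/provider/CDS wiring described above can be sketched as follows. Component names are illustrative, and SQLConnection1 is assumed to be already configured for your database:

```delphi
// Sketch: fast forward-only DBX cursor feeding a CDS via a provider.
// SQLConnection1 is assumed to be configured; names are illustrative.
SQLQuery1.SQLConnection := SQLConnection1;
SQLQuery1.SQL.Text := 'SELECT * FROM Table1';

DataSetProvider1.DataSet := SQLQuery1;     // provider reads the one-way
                                           // DBX cursor (First/Next only)
ClientDataSet1.SetProvider(DataSetProvider1);
ClientDataSet1.Open;                       // rows land in memory; the CDS
                                           // is now fully navigable/editable
```

The DBX query stays a cheap one-pass cursor on the server; the CDS layer on top is what gives back bidirectional navigation and editing.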

Michael
  • Sorry, the question was why is a CDS so slow with that number of records compared with an Ado dataset. The correct answer is that it always has been slow for more than a few 10k of records. Originally it was because of the way it allocated memory, but even though that has been improved over the years, it was designed for small, client-side datasets, and 180k rows is not "small" in that context. Btw a TAdoQuery feeding a CDS via a TDatasetProvider is way faster than a DBX TSqlQuery retrieving the same data. – MartynA Feb 14 '16 at 18:49
  • And, properly used, a TAdoQuery can do most of the nifty things a CDS can (like its "briefcase" model) without requiring (unlike DBX) a CDS to do the client-side stuff, and it will be much quicker at it too, IME. – MartynA Feb 14 '16 at 18:54
0

Setting the CDS1.LogChanges property to False or True makes no difference; the problem is the same.

-1

Try setting the CDS1.LogChanges property to False before loading the data. This needs to be done in code as it is not a published property.

From the help file: "For large datasets, a value of True for LogChanges may severely impact the performance of the application."

You can then turn it on after the initial load.
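Using the question's component names, that sequence might look like this (a sketch; the exact load calls depend on your setup):

```delphi
// Sketch: LogChanges is public but not published, so it must be
// set in code, before the bulk load. Names come from the question.
CDS1.LogChanges := False;  // skip the change log while loading 180k rows
CDS1.Open;
CDS1.Last;                 // forces the full fetch through the provider
CDS1.LogChanges := True;   // re-enable change tracking for normal editing
```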

Larsdk