
We are using OpenEdge 10.2A and generating summary reports with Progress procedures. We want to decrease the production time of these reports.

Since the ACCUMULATE statement and ACCUM function are not noticeably faster than defining variables to hold summarized values, and their readability is much worse, we don't really use them.

We have tested our data with SQL commands over an ODBC connection, and the results are much faster than with procedures.

Let me give you an example. We run the procedure below:

DEFINE VARIABLE i AS INTEGER NO-UNDO.

ETIME(TRUE).
FOR EACH orderline FIELDS(ordernum) NO-LOCK:
    ASSIGN i = i + 1.
END.
MESSAGE "Count = " (i - 1) SKIP "Time = " ETIME VIEW-AS ALERT-BOX.

The result is:

Count= 330805
Time= 1891

When we run the equivalent SQL query:

SELECT count(ordernum) from pub.orderline

The execution time is 141.

In short, when we compare the two results, the SQL query is more than 13 times faster than the procedure.

This is just one example. We can run the same test with other aggregate functions and the time ratio does not change much.

My question has two parts:

1) Is it possible to get aggregate values using procedures as fast as using SQL queries?

2) Is there any other method to get summarized values faster, other than using real-time SQL queries?

Tom Bascom
aza

4 Answers


The 4GL and SQL engines use very different approaches to sending data to the client. By default SQL is much faster. To get similar performance from the 4GL you need to adjust several parameters. I suggest:

-Mm 32600                  # message size, default 1024, max 32600
-prefetchDelay             # don't send the first record immediately, instead bundle it
-prefetchFactor 100        # try to fill message 100%
-prefetchNumRecs 10000     # if possible pack up to 10,000 records per message, default 16

Prior to 11.6, changing -Mm requires BOTH the client and the server to be changed. Starting with 11.6, only the server needs to be changed.

You need at least OpenEdge 10.2B06 for the -prefetch* parameters.

Although there are caveats (among other things, joins will not benefit), these parameters can greatly improve the performance of NO-LOCK queries. A simple:

FOR EACH table NO-LOCK:
  /* ... */
END.

can be greatly improved by the use of the parameters above.

Use of a FIELDS list can also help a lot because it reduces the amount of data, and thus the number of messages, that need to be sent. So if you only need some of the fields rather than the whole record, you can code something like:

FOR EACH customer FIELDS ( name balance ) NO-LOCK:

or:

FOR EACH customer EXCEPT ( photo ) NO-LOCK:

You are already using FIELDS, and your sample query is a simple NO-LOCK, so it should benefit substantially from the suggested parameter settings.
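To put the pieces together, a hedged sketch of a complete broker startup line using these parameters (the database path, host name, and service port are hypothetical placeholders, not values from the question):

_mprosrv.exe c:\db\myDb -N TCP -H SERVER -S 12345 -Mm 32600 -prefetchDelay -prefetchFactor 100 -prefetchNumRecs 10000

On releases prior to 11.6, remember that the connecting client sessions would also need -Mm 32600 among their startup parameters.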

Tom Bascom

The issue at hand seems to be to "decrease the production time of the reports".

This raises some questions:

  • How slow are the reports now and how fast do you want them to be?
  • Has running time increased compared to, for instance, last year?
  • Has the amount of data also increased?
  • Has something changed? Servers, storage, clients, etc.?

It will be impossible to answer your question without more information. Data access from ABL will most likely be fast enough if:

  • You have correct indexes set up in your database.
  • You have "good" queries.
  • You have enough system resources (memory, CPU, disk space, disk speed).
  • You have a database running with a decent setup (-spin, -B parameters, etc.).

The time it takes for a simple command like FOR EACH <table> NO-LOCK: or SELECT COUNT(something) FROM <somewhere> might not indicate how fast or slow your real, much more complicated query will run.
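To illustrate the point about indexes, a minimal sketch (the table and field names are assumed, not taken from the question): a WHERE clause that can bracket an index reads only the matching records, while one that wraps the field in an expression forces the whole table to be scanned.

/* Assumes a "customer" table with indexes on custnum and name. */

/* Brackets the custnum index - reads only matching records: */
FOR EACH customer NO-LOCK WHERE customer.custnum = 42:
END.

/* The function applied to the field prevents index bracketing, */
/* so every record in the table is read and then filtered:      */
FOR EACH customer NO-LOCK WHERE SUBSTRING(customer.name, 1, 1) = "A":
END.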

Jensd
  • Our program runs on more than 400 different servers and we have more than 900 different reports. There are customers with millions of records, whereas some of our customers have only thousands of records in the same table. Some have 100 users, some have only 1 user. Some have high-capacity hardware, some have less. But in all cases it is the same for all the summary queries we tested: Progress procedures are slow compared to equivalent SQL queries running on the same server. I could give complex queries as examples here as well, but the result is the same: the SQL queries run 5 times or more faster. – aza Oct 09 '18 at 14:04
  • Our Progress startup parameters might not be right, as you mentioned. But I guess the performance of the SQL queries is also affected by them. Our startup parameters are as follows: Maximum clients: 201; Maximum servers: 4; Maximum clients per server: 0; Lock table size: 1000000 entries; Database buffers: 3000 (24000 KB); Excess shared memory: 60 KB; APW queue check time: 100 milliseconds; APW scan time: 1 second; APW buffers to scan: 5; APW max writes/scan: 25; Spinlock tries before timeout: 50000; Before-image buffers: 25 (200 KB); After-image buffers: 25 (200 KB); Max number of JTA transactions: 100 – aza Oct 09 '18 at 14:06
  • Often the reactions are the opposite: ODBC is slow and ABL is fast, so perhaps it's an issue with the database brokers. I doubt that anybody will be able to help without more information. – Jensd Oct 09 '18 at 14:10
  • Thanks for your quick reply. One of the two main problems we have with Progress is that it is slow, and I'd be happy to give any information needed. Here is how we start our database: _mprosrv.exe c:\db\myDb -N TCP -H SERVER -S 12345 -E -n 200 -Mi 200 -L 1000000 -Mxs 60 -spin 50000 -bibufs 25 -minport 22000 -maxport 22500. If there is any other information that can help solve the problem, I'm ready to give it. – aza Oct 09 '18 at 14:23
  • Distance between client and server might be an issue. Also you should look into utilities like promon to check metrics like buffer hits. – Jensd Oct 09 '18 at 14:26
  • I'm running the tests on the server, not on the client. If ODBC is generally slower than ABL, then there must be something really wrong with our startup parameters, because in our tests all the ABL summary queries are slower. – aza Oct 09 '18 at 14:36
  • Sounds like an issue somewhere, but it is hard to tell. It might be different queries on ODBC/ABL, different login brokers with different parameters, etc. Try contacting Progress Support. – Jensd Oct 09 '18 at 14:42
  • If your db is generally untuned or mis-tuned (and it sounds like it is), there are probably lots of improvements available. It is way off-topic for the root question but, for instance, -Mi 200 is seriously wrong-headed, and a whole lot of other parameters seem to be missing from your provided list. – Tom Bascom Oct 13 '18 at 20:55
  • This sort of open-ended discussion would be better suited to a forum like https://community.progress.com/?Redirected=true – jdpjamesp Oct 17 '18 at 09:03
  • Thanks Tom, I have reconfigured the startup parameters for some of my customers (50-100 users per server) as follows: -Mi 5 -Ma 10 -Mn 20 -n 220 -B 150000. I'm waiting for feedback from these customers. Thanks again for your help. – aza Oct 19 '18 at 08:51

Some additional suggestions:

It is possible to write your example as

DEFINE VARIABLE i AS INTEGER NO-UNDO.

ETIME(TRUE).
select count(*) into i from orderline.
MESSAGE "Count = " (i - 1) SKIP "Time = " ETIME VIEW-AS ALERT-BOX.

which should yield a moderate performance increase. (This does not use an ODBC connection; you can use a subset of SQL in plain 4GL procedures. It is debatable whether this can be considered good style.)

There should be a significant performance increase from accessing the database through shared memory instead of TCP/IP, if you are running the code on the server (which you do) and are not already doing so (which you didn't specify).
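To illustrate the shared-memory point, a minimal sketch of the two connection styles (the database path, host, and port are hypothetical): a self-service connection simply omits the -H and -S parameters, so the session attaches directly to the database's shared memory instead of talking to a server process over TCP/IP.

/* Client/server connection over TCP/IP: */
CONNECT "c:\db\myDb" -H SERVER -S 12345.

/* Self-service shared-memory connection - only possible when */
/* the session runs on the same machine as the database:      */
CONNECT "c:\db\myDb".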

idspispopd
  • When I start the db with the parameters -L 10000 -B 100000 -bibufs 25 -Mxs 60 -Ma 10 -spin 50000 -Mi 5 -Mn 20 -n 220 and a TCP/IP connection, the FOR EACH code runs in 1130 ms and your code runs in 1207 ms. When I try to connect using shared memory, it raises the error "Unable to attach shared memory, sharedmem.c.mydb.db.0, error 0 (1720)." If I decrease -B 100000 to -B 10000, the timings with a TCP/IP connection are 1896 for FOR EACH and 2143 for your code; the timings with shared memory are 1356 for FOR EACH and 1269 for your code. – aza Oct 25 '18 at 10:39
  • Sorry to hear that the timing gets worse with select count(*) over a TCP/IP connection. I tested it before posting and it reduced the run time by about a third, but that was using shared memory. – idspispopd Oct 26 '18 at 07:23
  • Decreasing -B that much is probably not worth it; you could try decreasing it, but not by as much. -B 100000 should be achievable, though you might need to change other parameters. See https://knowledgebase.progress.com/articles/Article/P129034 for details. The more elegant solution would be to switch to 64-bit Progress, but that would probably require additional licenses. – idspispopd Oct 26 '18 at 07:29
  • I think switching to 64-bit and a newer version of OpenEdge would handle many of my problems, but it is not an option right now. Thanks for your help. I'll check the article. – aza Oct 27 '18 at 11:07

Another option: a PRESELECT query builds its result list when the query is opened, so NUM-RESULTS returns the record count without iterating record by record in the 4GL:

DEFINE QUERY q FOR orderline.

OPEN QUERY q PRESELECT EACH orderline NO-LOCK.
MESSAGE NUM-RESULTS("q") VIEW-AS ALERT-BOX.