
I have Machine Learning Services installed on SQL Server so that I can run R scripts (called from stored procedures).

There are now several of these stored procedures as part of a script that populates data into Power BI. Because they are run from a Power BI Service refresh, they appear to execute in parallel and consume large amounts of CPU while they run, which slows down the server for other (more critical) processes. In Task Manager there are often as many as 10 instances of RTerm.exe running at the same time when the refresh is invoked.

When a single one of these stored procedures is run, sometimes one, but often several, instances of RTerm.exe start on the server.

Is there a way to limit ML Services / R so that it has access to only a proportion of the available CPU and other resources, ensuring it never consumes more than a prescribed maximum? Are there any other strategies that people have used to resolve this issue?

Dale K
Brisbane Pom
    Yes. It is documented here: https://learn.microsoft.com/en-us/sql/advanced-analytics/administration/resource-governor?view=sql-server-ver15 – Conor Cunningham MSFT Dec 05 '19 at 01:39
  • Thanks for that Conor - found some other useful documents here: https://learn.microsoft.com/en-us/sql/relational-databases/resource-governor/resource-governor-resource-pool?view=sql-server-ver15 https://learn.microsoft.com/en-us/sql/relational-databases/resource-governor/create-a-resource-pool?view=sql-server-ver15 What I am interested to learn is the sort of strategies people have applied to the MIN / MAX CPU settings across the pools to achieve desired results. In my scenario I want to say "when it is available, let R use 80% of available CPU, but cap when other processes need it" – Brisbane Pom Dec 11 '19 at 23:08
  • It depends heavily on what kind of application you run as your main workload and what resources it requires. ML operations are not cheap, but usually you would prioritize the daily runtime workload (scoring and transactional) over training if both run on the same system. You could use the transactional workload during the training window (off hours) to guide what settings to select. Alternatively, you could train on another system and do scoring and transactional work on the main system. SQL is a Turing-complete language, so you see all sorts of app patterns. – Conor Cunningham MSFT Dec 28 '19 at 02:24
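The Resource Governor route from the comments can be sketched as follows. This is a sketch only: it assumes SQL Server 2016 or later with ML Services installed, and the 80% figure comes from the scenario described above, not a recommendation. Note that MAX_CPU_PERCENT on an external resource pool only throttles when other workloads are competing for CPU, which matches the "let R use it when available, cap it under contention" goal.

```sql
-- Sketch: cap external (R) scripts via the default external resource pool.
-- MAX_CPU_PERCENT is enforced only under CPU contention, so R can still
-- use idle CPU when nothing else needs it.
ALTER EXTERNAL RESOURCE POOL [default]
    WITH (MAX_CPU_PERCENT = 80, MAX_MEMORY_PERCENT = 20);

-- Apply the new configuration.
ALTER RESOURCE GOVERNOR RECONFIGURE;
```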

1 Answer


SQL Server manages a pool of RTerm.exe instances to handle multiple requests, and by default it grows that pool by 5 extra instances at a time when more are needed (so 6 spin up at once when the first external script executes).

This can be changed by setting a value for PROCESS_POOL_SQLSATELLITE_GROWTH in rlauncher.config in the MSSQL/Binn folder and then restarting the SQL Server Launchpad service in SQL Server Configuration Manager.
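As a sketch, the relevant entry in rlauncher.config is a single key/value line (the file's exact location varies by instance and version — back up the file before editing it):

```
PROCESS_POOL_SQLSATELLITE_GROWTH=1
```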

In my case, scripts were failing with memory allocation errors. It turned out those 5 extra instances were consuming most of the default external memory budget (20% of physical memory) by themselves, not leaving enough for relatively simple scripts to complete. Setting PROCESS_POOL_SQLSATELLITE_GROWTH=1 resolved the issue.

I found this information here: https://nielsberglund.com/2017/04/23/microsoft-sql-server-r-services---internals-iv/

W. Murphy