Context
I use and manage* a Windows server at work. It is used for general computations by up to 5 users at a time via RDP. The server has 128GB of RAM, which is sufficient for the work we do on it. However, now and then one of the processes eats up almost all memory due to a mistake in a user's script (e.g. an erroneous array initialization, or forgetting to remove a variable).
When that happens, all RDP connections are dropped and the server is uncontrollable until the memory usage is reduced or the server is restarted. The latter is a last resort as that leads to data loss for all users. I'm not sure what the exact memory threshold is before this "crash" happens, but it's somewhere in the 97% region.
Things that I've tried
Commands that do work
While the server is under heavy load I can still get a response from it with the commands below:
ping: works normally
tasklist /s servername: returns data, but is very slow. It does allow me to find the offending PID and session ID.
Enter-PSSession servername: works, but only starts a session after a very long time
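For reference, here is a rough sketch of how the offending PID and session ID could be picked out of the tasklist output automatically. It assumes tasklist is run with its CSV output option (/fo csv) and that the column headers are the English ones ("Image Name", "PID", "Session#", "Mem Usage"); on a localized system the header names would differ. The function names are my own, purely for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /fo csv` output into dicts with memory in KB.

    Assumes English column headers; adjust the keys for other locales.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # "Mem Usage" looks like "1,234,567 K"; keep only the digits.
        digits = "".join(ch for ch in row["Mem Usage"] if ch.isdigit())
        rows.append({
            "image": row["Image Name"],
            "pid": int(row["PID"]),
            "session": row["Session#"],
            "mem_kb": int(digits) if digits else 0,
        })
    return rows


def top_memory_process(text):
    """Return the entry with the largest memory usage, or None if empty."""
    rows = parse_tasklist_csv(text)
    return max(rows, key=lambda r: r["mem_kb"]) if rows else None


def query_server(server):
    """Run tasklist against a remote server and return its CSV output.

    Hypothetical helper: only works on Windows, and will be as slow as
    the tasklist call itself when the server is under memory pressure.
    """
    result = subprocess.run(
        ["tasklist", "/s", server, "/fo", "csv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling top_memory_process(query_server("servername")) would then return the image name, PID, session ID and memory usage of the biggest process. This only identifies the process; it does not solve the problem of killing it.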
Commands that do not work
I've tried the commands below to kill the offending process and regain control. Unfortunately none of them worked within 10-15 minutes.
taskkill /s servername /pid pid /f: does nothing and stops after 10-15 minutes with a message about an internal error
pskill \\servername pid: does nothing, stopped it manually
logoff sessionID /server:servername: does nothing, stopped it manually
Question
How can I quickly kill the memory-eating process when the server is at ~97% of its memory and does not respond to the above commands?
*Corporate IT manages the server overall, but I manage periodic updates, user management, and software installations.