3

I want to measure the CPU usage of my ML model during testing (that is, during a call to predict). Here is what I am currently doing, where p is the current process:

start = p.cpu_percent(interval=1)
y_hat = clf.predict(X_test)
print(abs(p.cpu_percent(interval=None) - start)) # prints cpu usage (%)

Is this the correct approach or is there a better way to achieve this?

Yurroffian
  • I like to run `htop` or `top` in a terminal while my model is running to monitor usage. – jkr Dec 17 '20 at 21:03
  • I looked into that as well, what is the command to do that for a specific function call (testing on a dataset)? Also, shouldn't that accomplish the same as the above using psutil? – Yurroffian Dec 17 '20 at 21:32
  • I'm not sure about the specific command, but you can look for running `python` executable(s). Your example is not equivalent because you are taking two snapshots of CPU percent (before and after), so you do not get the CPU percentage _during_ prediction. One possible solution is to run `cpu_percent()` periodically in a separate thread. – jkr Dec 17 '20 at 21:51
  • I edited the code in my question which I believe captures the cpu usage during the call to predict? – Yurroffian Dec 18 '20 at 15:52
  • If your cpu_percent function returns the total CPU usage of all processes, subtracting the original usage before running from the usage after running (what you're doing) should work. – TechPerson Dec 18 '20 at 16:46
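
The suggestion in the comments above (sampling periodically from a separate thread) could be sketched roughly as follows. This is only an illustration under some assumptions: `sample_cpu_during`, `make_stdlib_reader` and `busy_work` are hypothetical names, `busy_work` stands in for `clf.predict(X_test)`, and the reader is built from the standard library so that the sketch runs without psutil.

```python
import threading
import time


def make_stdlib_reader():
    """Return a callable giving this process's CPU % since its previous call.

    Hypothetical stand-in for a psutil-based reader, so the sketch needs no
    third-party packages.
    """
    last = {"cpu": time.process_time(), "wall": time.perf_counter()}

    def read():
        cpu, wall = time.process_time(), time.perf_counter()
        d_cpu, d_wall = cpu - last["cpu"], wall - last["wall"]
        last["cpu"], last["wall"] = cpu, wall
        return 100.0 * d_cpu / d_wall if d_wall > 0 else 0.0

    return read


def sample_cpu_during(func, read_cpu, interval=0.1):
    """Run func() while a background thread calls read_cpu() every `interval` s.

    Returns (result of func, list of CPU % samples).
    """
    samples = []
    stop = threading.Event()

    def sampler():
        # Event.wait doubles as an interruptible sleep: it returns False on
        # timeout and True once stop.set() has been called.
        while not stop.wait(interval):
            samples.append(read_cpu())

    read_cpu()  # prime the reader so the first sample covers only func's run
    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        result = func()
    finally:
        stop.set()
        thread.join()
    samples.append(read_cpu())  # final reading covers the tail of the run
    return result, samples


def busy_work():
    # Hypothetical stand-in for clf.predict(X_test).
    return sum(i * i for i in range(2_000_000))


result, samples = sample_cpu_during(busy_work, make_stdlib_reader(), interval=0.05)
print(f"mean CPU: {sum(samples) / len(samples):.1f}%, peak: {max(samples):.1f}%")
```

With psutil installed, `psutil.Process().cpu_percent` could probably be passed as `read_cpu` instead, since calling it without an interval reports the usage since the previous call. Values above 100% are possible when the work runs on several cores.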

2 Answers

1
  1. With psutil, you can call psutil.virtual_memory() to read system memory statistics.

    import psutil
    mem = psutil.virtual_memory()
    print(mem)
    # svmem(total=10367352832, available=6472179712, percent=37.6, used=8186245120, free=2181107712, active=4748992512, inactive=2758115328, buffers=790724608, cached=3500347392, shared=787554304, slab=199348224)
    
    THRESHOLD = 100 * 1024 * 1024  # 100 MB
    if mem.available <= THRESHOLD:
        print("warning")
    
    # you can convert that named tuple to a dictionary
    dict(psutil.virtual_memory()._asdict())
    

  2. psutil.cpu_times(percpu=False) returns system CPU times as a named tuple.

    import psutil
    print(psutil.cpu_times())
    # scputimes(user=17411.7, nice=77.99, system=3797.02, idle=51266.57, iowait=732.58, irq=0.01, softirq=142.43, steal=0.0, guest=0.0, guest_nice=0.0)
    

  3. os.times() returns the current global process times. The return value is an object with five attributes: user, system, children_user, children_system, and elapsed.

    import os
    
    curr_gp_times = os.times()
    print(curr_gp_times)
    # posix.times_result(user=0.03, system=0.01, children_user=0.0, children_system=0.0, elapsed=17370844.95)
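
To turn these cumulative times into a figure for a single call, one option is to take a snapshot before and after the call and divide the CPU time used by the wall-clock time. The following is a minimal sketch under that assumption; `cpu_percent_of_call` is a hypothetical helper name, and `sum(range(...))` merely stands in for `clf.predict(X_test)`.

```python
import os
import time


def cpu_percent_of_call(func, *args, **kwargs):
    """Run func and return (result, cpu_seconds, wall_seconds, percent)."""
    before = os.times()
    wall_start = time.perf_counter()
    result = func(*args, **kwargs)
    wall = time.perf_counter() - wall_start
    after = os.times()
    # User + system CPU time consumed by this process between the snapshots.
    cpu = (after.user - before.user) + (after.system - before.system)
    percent = 100.0 * cpu / wall if wall > 0 else 0.0
    return result, cpu, wall, percent


# Hypothetical workload standing in for clf.predict(X_test).
result, cpu, wall, percent = cpu_percent_of_call(sum, range(1_000_000))
print(f"cpu={cpu:.3f}s wall={wall:.3f}s usage={percent:.1f}%")
```

The wall-clock time comes from time.perf_counter() rather than the elapsed attribute because elapsed is reported as zero on Windows. The result is relative to one core, so a multi-threaded predict can exceed 100%, and very short calls may read as 0% because the CPU counters are coarse on some systems.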
    

EDIT: This may be closer to what you are looking for:

  4. psutil.cpu_times(percpu=False) returns system CPU times as a named tuple. Every attribute represents the seconds the CPU has spent in the given mode.

    • user: time spent by normal processes executing in user mode; on Linux this also includes guest time

    • system: time spent by processes executing in kernel mode

    • idle: time spent doing nothing

        import psutil
        psutil.cpu_times()
        # scputimes(user=17411.7, nice=77.99, system=3797.02, idle=51266.57, iowait=732.58, irq=0.01, softirq=142.43, steal=0.0, guest=0.0, guest_nice=0.0)
      
Federico Baù
  • Thanks for the response! Isn't virtual_memory().percent associated with ram usage? – Yurroffian Dec 18 '20 at 16:37
  • It appears to be from his link. 2 and 3 should get the CPU usage though, just keep in mind they return a time instead of a percentage. – TechPerson Dec 18 '20 at 16:48
  • Yurrofian, it returns usage in bytes, as stated in the documentation. There are some caveats depending on the OS; the documentation suggests checking this GitHub file: https://github.com/giampaolo/psutil/blob/master/scripts/meminfo.py MEMORY ------ Total : 9.7G Available : 4.9G Percent : 49.0 Used : 8.2G Free : 1.4G Active : 5.6G Inactive : 2.1G Buffers : 341.2M Cached : 3.2 – Federico Baù Dec 18 '20 at 19:35
  • I actually added a 4th possibility – Federico Baù Dec 18 '20 at 19:40
1

Assuming you want to do this within your program using only the standard library, Python's resource module might be of use to you here (note that it is only available on Unix). psutil is a good option (as suggested by Federico) if you're able to install packages.

Outside your program, there are many ways to get the CPU usage of an arbitrary process. If you're on *nix and prefer the command line, top and similar commands should do the job. On a graphical interface, I personally prefer KSysGuard (I'm on Kubuntu). Gnome System Monitor works as well. On Windows, the Task Manager should suffice.

EDIT: The module-level psutil functions such as psutil.cpu_times() return system-wide usage. If you only want the usage of your own process, you'd be better off with resource or os.times, but if you want total CPU utilization (including other processes), psutil is a more robust solution.

For CPU times in resource:

import resource

# getrusage() requires a "who" argument; RUSAGE_SELF means the current process
resource.getrusage(resource.RUSAGE_SELF)[0]  # Returns the time in seconds in user mode
# Note that this time accumulates while the program runs,
# so you might want to save its previous value each time you take a measurement
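
As a rough sketch of the "save the previous value" idea, the snippet below records the accumulated CPU time before and after a piece of work and reports the difference. `cpu_seconds` is a hypothetical helper, the summation is only a stand-in for `clf.predict(X_test)`, and the resource module is assumed to be available (Unix only).

```python
import resource


def cpu_seconds():
    """Accumulated user + system CPU time of the current process, in seconds."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


previous = cpu_seconds()
total = sum(i * i for i in range(1_000_000))  # stand-in for clf.predict(X_test)
current = cpu_seconds()

print(f"CPU time used by the call: {current - previous:.3f} s")
```

Including ru_stime alongside ru_utime is a design choice here: it also counts the time spent in kernel mode on behalf of the process, which index 0 (user time) alone would miss.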
TechPerson