OpenShift liveness probe to detect a long-running process

Question

I have a Python data load service. One of its steps refreshes multiple Oracle materialized views. We have noticed that the service often gets stuck at this step, and restarting the pod fixes it. I want to configure a command-based OpenShift liveness probe for this. The purpose is to detect whether the service has been stuck at this step for more than, say, x hours; if so, the probe fails and the pod is restarted. The service does not expose an HTTP endpoint.

We log extensively in the script that runs here. Is there a way to poll the latest OpenShift deployment log and look for certain messages?

Example:

# msg1
print("Refreshing materialized views")
# ... refresh steps ...
# msg2
print("materialized view refreshed")

msg1 marks the start of the potentially problematic step. My intent is to write a command that polls the log and looks for msg2, which marks completion (exit status 0). If it does not find msg2 for more than, say, 5 hours, it must return a non-zero exit status, causing the probe to fail.
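
For illustration, here is a minimal sketch of one way such a check could look. It swaps in a different technique from the log polling described above. An exec probe runs inside the container and normally cannot see the log stream that `oc logs` shows, so this sketch assumes the service touches a marker file next to msg1 and removes it next to msg2. The probe then only checks the age of that file. The path `/tmp/mv_refresh_started`, the 5-hour threshold, the `--probe` flag, and all function names are hypothetical and not part of the original service.

```python
import os
import sys
import time

# Hypothetical marker path and threshold; adjust to the real service.
MARKER_PATH = "/tmp/mv_refresh_started"
MAX_REFRESH_SECONDS = 5 * 60 * 60  # 5 hours


def mark_refresh_started(path=MARKER_PATH):
    """Call next to msg1: create (or re-touch) the marker file."""
    with open(path, "w"):
        pass
    os.utime(path, None)  # set mtime to the current time


def mark_refresh_finished(path=MARKER_PATH):
    """Call next to msg2 (and at service startup): remove the marker."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def refresh_is_stuck(path=MARKER_PATH, max_seconds=MAX_REFRESH_SECONDS, now=None):
    """Return True if a refresh started more than max_seconds ago and has not finished."""
    now = time.time() if now is None else now
    try:
        started = os.path.getmtime(path)
    except FileNotFoundError:
        return False  # no refresh in progress
    return (now - started) > max_seconds


if __name__ == "__main__" and sys.argv[1:2] == ["--probe"]:
    # Non-zero exit status fails the liveness probe; zero passes it.
    sys.exit(1 if refresh_is_stuck() else 0)
```

Under these assumptions, the service would call `mark_refresh_started()` and `mark_refresh_finished()` around the refresh step, and the liveness probe's `exec.command` would run this file with `--probe`. Because the threshold is measured in hours, a coarse `periodSeconds` on the probe would likely be enough.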

How can I implement this? Is this the best way to do it?

Do you know why the refresh of the materialized views gets stuck? How are you refreshing the MVs? — Roberto Hernandez, Sep 23 '21 at 06:36
DBMS_SNAPSHOT.REFRESH('VIEWNAME','C'); Not really sure of the reason why it gets stuck. Plus the issue is not 100% reproducible. It occurs intermittently and that is why we intend the service to detect and handle it. — Ketan_Gupta, Sep 23 '21 at 06:42
If you run the script (the refresh of all the MVs) directly in Oracle, does it work, or does it get stuck as well? — Roberto Hernandez, Sep 23 '21 at 06:44
It works. Since this is for data ingestion the refresh time varies as per the quantum of data ingested by service. We just want to add the probe that detects if this process takes longer than x hours and restart accordingly. — Ketan_Gupta, Sep 23 '21 at 06:54
I would not put the logic in OpenShift. I would use a probe there that looks for a message in a table, and implement the logic inside the process you use to refresh the MVs in Oracle. — Roberto Hernandez, Sep 23 '21 at 07:32

