Issue: I'm writing a query that checks every Kubernetes node to make sure coredns is running and, if it isn't, whether more than 30 minutes have passed since it last was. If so, send an alert.
The alert part is secondary to my initial question and doesn't have to be addressed in this thread. I just want to figure out how to get this info in the first place.
Essentially: Hey node, do you have a pod named coredns.* running? If not, has it been more than 30m since you did?
My strategy: I assume I would start by searching for nodes that do not have a pod whose name matches coredns.*.
FROM K8sPodSample SELECT nodeName WHERE podName NOT LIKE 'coredns%'
Then, set the time frame to start 31 minutes ago. (I'm not sure whether this shows only nodes that have gone the full 31 minutes without the pod, or every node that lacked it at any point in the last 31 minutes, even if only for a few minutes.)
SINCE 31 minutes ago
The query will run at the cluster level, so I'll add a cluster filter as well.
WHERE clusterName = '<clusterName>'
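Put together, the pieces above would read roughly as follows. This is only a sketch: I'm assuming NOT LIKE with a % wildcard is the right way to express the name match, since NRQL's LIKE takes % wildcards rather than a regex, and <clusterName> is a placeholder. As far as I can tell, this returns every node that reported at least one non-coredns pod in the window, which is likely nearly all of them, so it probably doesn't answer the question by itself.
FROM K8sPodSample SELECT nodeName
WHERE podName NOT LIKE 'coredns%' AND clusterName = '<clusterName>'
SINCE 31 minutes ago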
Then, if this worked properly, I'll generate an alert for any nodes that show up in this list.
Am I thinking about this properly, or could this be accomplished in a better way?
Update: My new strategy is to return each nodeName where the count of pods with coredns in their name is 0. I'm still working this part out.
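A minimal sketch of that idea, assuming filter() and FACET behave the way I understand them: count the distinct coredns pod names per node over the last 30 minutes, so a node showing 0 has not reported a coredns pod in that window. The coredns% pattern, the corednsPods label, and the <clusterName> placeholder are my own assumptions. LIMIT MAX is there because FACET only returns the top 10 values by default, which could hide the nodes sitting at 0. A node that reports no pods at all would not appear in the results.
FROM K8sPodSample
SELECT filter(uniqueCount(podName), WHERE podName LIKE 'coredns%') AS 'corednsPods'
WHERE clusterName = '<clusterName>'
FACET nodeName
SINCE 30 minutes ago
LIMIT MAX
If this holds up, the alert could presumably trigger on any nodeName facet whose corednsPods value is below 1.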