(This won't fit in a comment/benefits from formatting)
@Ashis @Vit is correct, but I'll add some clarification around what finalizers are for and why they sometimes need to be patched.
Also, regarding "It's not good practice to remove finalizers": it's more accurate to say that, as a general rule of thumb, you shouldn't do it. But it's important to know why you shouldn't, because that lets you understand the exception to the rule / when it's fine to go against the rule of thumb.
Note: this is such a case; here's why:
- Both of the following are true rule of thumb statements:
- Normally you shouldn't have to touch finalizers.
- Normally resources shouldn't be stuck in a deleting state for over 1 hour, or fail to ever transition to deleted even if you wait 24 hours.
- When you're not in a normal circumstance, it's fine to go against the rule of thumb.
What's the purpose of finalizers?
https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/
- Finalizers are responsible for garbage collection / auto-deletion of auto-provisioned resources.
- If creating a service.yaml can spawn a CSP LB (cloud service provider load balancer),
- then deleting that service.yaml should also delete the CSP LB it spawned, to prevent orphaned resources from existing/costing money.
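As a concrete illustration (a minimal sketch; the Service name is made up), on many cloud providers you can see the cleanup finalizer the service controller adds to a Service of type LoadBalancer:
# hypothetical Service name; on many clusters this prints ["service.kubernetes.io/load-balancer-cleanup"]
kubectl get service my-loadbalancer-svc -o jsonpath='{.metadata.finalizers}{"\n"}'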
Why can a resource get stuck in an undeletable state?
- Finalizers are the implementation detail that handles garbage collection of those resources, but finalizer logic can and occasionally will get stuck in an inconsistent state that requires manual intervention to resolve.
- https://github.com/kubernetes/kubernetes/issues/39420#issuecomment-546781470
mentions a few race-condition-type bugs that can leave a service stuck in deleting / put the service controller in a confused state (for example, a delete and an update operation happen at the same time, or a delete can't succeed because the external resource was already deleted). The finalizer then never gets updated/removed because the race condition left it in a stuck, inconsistent state.
- These race-condition bugs (inconsistently reproducible, semi-rare, yet common enough) occur across OpenShift, GKE, Azure, EKS, and DIY Kubernetes. They can happen with Ingress (mostly on GKE), Services, Namespaces, PVCs, and other objects.
- BTW, this trick for fixing the inconsistent finalizer state even shows up in official CSP docs, like GCP's (and Googlers invented Kubernetes):
https://cloud.google.com/kubernetes-engine/docs/troubleshooting#namespace_stuck_in_terminating_state
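A quick way to confirm an object really is stuck on a finalizer (rather than just slow) is to check that its deletionTimestamp is set while finalizers are still listed. A minimal sketch, assuming a Service named my-stuck-svc:
# deletionTimestamp set + finalizers still present = the delete is waiting on finalizer cleanup
kubectl get service my-stuck-svc -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'
kubectl get service my-stuck-svc -o jsonpath='{.metadata.finalizers}{"\n"}'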
How to fix:
Read https://www.middlewareinventory.com/blog/kubectl-delete-stuck-what-to-do/
Here's a summary of what it says:
- Start by following the rule of thumb of not deleting a finalizer / give garbage collection time to run, but after waiting 1 hour consider more forceful measures.
- If you're really curious, check the logs of kube-controller-manager (or of another controller, such as aws-load-balancer-controller (a kube add-on for clusters running on AWS) or a storage class controller add-on (for on-premises Kubernetes that uses DIY storage classes)). These might tell you why the finalizer is stuck.
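(Sketch of pulling those logs, assuming a kubeadm-style cluster where the controller runs as a static pod in kube-system; on managed offerings like GKE/EKS/AKS the kube-controller-manager logs live in the cloud provider's console instead.)
kubectl logs -n kube-system -l component=kube-controller-manager --tail=100
# if the AWS add-on is installed, its deployment is usually (but not always) named like this
kubectl logs -n kube-system deployment/aws-load-balancer-controller --tail=100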
- If you understand the external resource that would have been created / that the finalizer is intended to clean up, and you're willing to check whether it already got deleted or to clean it up manually, then you should be safe to patch the finalizers to an empty list (clearing the inconsistent state) and then delete the object.
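(Sketch of checking whether the CSP LB already got cleaned up; assumes you have the provider's CLI configured and know roughly what the controller provisioned.)
# GCP: GKE-provisioned LBs show up as forwarding rules
gcloud compute forwarding-rules list
# AWS: look for the ELB/ALB the service controller created
aws elbv2 describe-load-balancers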
- To fix resources stuck in terminating longer than 1h:
kubectl patch (service|ingress|pvc|etc) (name-of-resource) -p '{"metadata":{"finalizers":[]}}' --type=merge
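For example (made-up names), clearing the finalizers on a PVC named data-pvc in the apps namespace so its pending delete can complete:
kubectl patch pvc data-pvc -n apps -p '{"metadata":{"finalizers":[]}}' --type=merge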
- To fix namespaces stuck in terminating longer than 1h:
- The following works with Linux/macOS/Win11 WSL2's bash, but assumes jq (a JSON query tool) is installed
export NS=istio
kubectl get ns $NS -o json | jq '.spec.finalizers=[]' | kubectl replace --raw /api/v1/namespaces/$NS/finalize -f -
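Afterwards you can confirm the namespace is actually gone (a NotFound error here is the success case):
kubectl get namespace $NS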