I have written a Go-based K8s client application that connects to a K8s cluster. To handle real-time notifications (add, delete, update) for Pods, Namespaces, and Nodes, I have set up informers. The code snippet is below.

I want to draw particular attention to the "runtime.HandleCrash()" function, which (I guess) helps redirect runtime panics/errors to the panic file.

// Redirect stderr to the panic file.
panicFile, _ := os.OpenFile("/var/log/panicfile", os.O_WRONLY|os.O_CREATE|os.O_SYNC, 0644)
syscall.Dup2(int(panicFile.Fd()), int(os.Stderr.Fd()))

Some of the errors collected in the panic file are shown below.

My question is: how can I program the informer so that it reports specific errors to my application instead of writing them to a panic file? That way, my application could handle this expected event more gracefully.

Is there any way I can register a callback function (similar to Informer.AddEventHandler())?

func (kcv *K8sWorker) armK8sPodListeners() error {

    // Kubernetes provides a utility to handle API crashes
    defer runtime.HandleCrash()

    var sharedInformer = informers.NewSharedInformerFactory(kcv.kubeClient.K8sClient, 0)

    // Add watcher for the Pod.
    kcv.podInformer = sharedInformer.Core().V1().Pods().Informer()
    kcv.podInformerChan = make(chan struct{})

    // Pod informer state change handler
    kcv.podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs {
        // When a new pod gets created
        AddFunc: func(obj interface{}) {
            kcv.handleAddPod(obj)
        },
        // When a pod gets updated
        UpdateFunc: func(oldObj interface{}, newObj interface{}) {
            kcv.handleUpdatePod(oldObj, newObj)
        },
        // When a pod gets deleted
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeletePod(obj)
        },
    })

    kcv.nsInformer = sharedInformer.Core().V1().Namespaces().Informer()
    kcv.nsInformerChan = make(chan struct{})

    // Namespace informer state change handler
    kcv.nsInformer.AddEventHandler(cache.ResourceEventHandlerFuncs {
        // When a new namespace gets created
        AddFunc: func(obj interface{}) {
            kcv.handleAddNamespace(obj)
        },
        // When a namespace gets updated
        //UpdateFunc: func(oldObj interface{}, newObj interface{}) {
        //    kcv.handleUpdateNamespace(oldObj, newObj)
        //},
        // When a namespace gets deleted
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeleteNamespace(obj)
        },
    })

    // Add watcher for the Node.
    kcv.nodeInformer = sharedInformer.Core().V1().Nodes().Informer()
    kcv.nodeInformerChan = make(chan struct{})

    // Node informer state change handler
    kcv.nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs {
        // When a new node gets created
        AddFunc: func(obj interface{}) {
            kcv.handleAddNode(obj)
        },
        // When a node gets updated
        UpdateFunc: func(oldObj interface{}, newObj interface{}) {
            kcv.handleUpdateNode(oldObj, newObj)
        },
        // When a node gets deleted
        DeleteFunc: func(obj interface{}) {
            kcv.handleDeleteNode(obj)
        },
    })

    // Start the shared informer.
    kcv.sharedInformerChan = make(chan struct{})
    sharedInformer.Start(kcv.sharedInformerChan)
    log.Debug("Shared informer started")

    return nil
}

In one specific use case, I shut down the K8s cluster, which causes the informer to write error messages to the panic file, as shown below.

As soon as I boot the K8s cluster nodes back up, it stops reporting these errors.

==== output from "/var/log/panicfile" ====== 

E0611 16:13:03.558214      10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Pod: Get https://10.30.8.75:6443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host
E0611 16:13:03.558224      10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Namespace: Get https://10.30.8.75:6443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host
E0611 16:13:03.558246      10 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Node: Get https://10.30.8.75:6443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.30.8.75:6443: connect: no route to host

1 Answer

Your question is:

Is there any way I can register a callback function (similar to Informer.AddEventHandler())?

I believe what you are looking for is SetWatchErrorHandler().

From the source code:

type SharedInformer interface {
    ...

    // The WatchErrorHandler is called whenever ListAndWatch drops the
    // connection with an error. After calling this handler, the informer
    // will backoff and retry.
    //
    // The default implementation looks at the error type and tries to log
    // the error message at an appropriate level.
    //
    // There's only one handler, so if you call this multiple times, last one
    // wins; calling after the informer has been started returns an error.
    //
    // The handler is intended for visibility, not to e.g. pause the consumers.
    // The handler should return quickly - any expensive processing should be
    // offloaded.
    SetWatchErrorHandler(handler WatchErrorHandler) error
}

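Call this function on the informer before it is started. Note that Reflector lives in the cache package, and that the call returns an error if the informer is already running:

err := kcv.podInformer.SetWatchErrorHandler(func(r *cache.Reflector, err error) {
    // your code goes here
})
if err != nil {
    // the informer was already started
}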

For reference, the default handler is DefaultWatchErrorHandler in k8s.io/client-go/tools/cache.
