10

I have generated logs for my pods using kubectl logs <pod-name>, but I want to persist these logs in a volume (some kind of persistent storage), because container logs will get wiped out if the pods go down. Is there a way to do this? Do I have to write some sort of script? I have read many answers but I still do not understand how to go about it; any help is appreciated. Thanks!

Saranya Gupta
  • 1,945
  • 2
  • 10
  • 14
  • What do you want to achieve? What are you trying to do with the stored logs? You should try to implement a centralised logging solution like ELK or similar tools. With one in place you can get hold of logs in real time and store them for further analysis. – Rohit Aug 19 '20 at 04:42
  • @Rohit I will be using these logs to do fault injection. Can you elaborate a bit more on how to use centralized logging solutions like ELK? Thanks! – Saranya Gupta Aug 19 '20 at 04:45
  • There is a lot of documentation online. Below is a way to set up a similar solution within the Kubernetes cluster: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes. I am, however, using a different solution installed outside the Kubernetes cluster, using Graylog + Elasticsearch, and I installed a Filebeat DaemonSet on Kubernetes which forwards all logs outside the cluster for storage and analysis. There are many ways and solutions readily available online. – Rohit Aug 19 '20 at 05:26

3 Answers

7

I know this is an old question, but I've just had the same problem and spent some time figuring out the solution, so I'd like to share a more detailed answer.

Like Aayush Mall said, you'll need the PersistentVolume and PersistentVolumeClaim objects to create the volume and then link it to the pod (preferably via a Deployment object).

Basically, the PersistentVolume defines how and where the volume is stored on the host, and the PersistentVolumeClaim defines the constraints used to bind the volume to some container.

From the docs:

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).

So, let's say your pods are running in two nodes: mynode-1 and mynode-2.

Your PersistentVolume spec will look like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: myapp-log-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /var/log/myapp
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - mynode-1
          - mynode-2
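
Note that this PV references a StorageClass named local-storage, which is assumed to already exist in the cluster. If it doesn't, a minimal sketch of one suitable for local volumes would be the following (local volumes don't support dynamic provisioning, so it uses the no-provisioner placeholder and delays binding until a consuming pod is scheduled):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # local volumes have no dynamic provisioner
volumeBindingMode: WaitForFirstConsumer     # bind only once a pod using the claim is scheduled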

And your PersistentVolumeClaim will look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-log-pvc
spec:
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  storageClassName: local-storage
  resources:
    requests:
      storage: 2Gi
  volumeName: myapp-log-pv

Then you just have to tell the Deployment object how to mount the volume inside the container. So your Deployment spec will look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
spec:
  selector:
    matchLabels:
      app: myapp
  replicas: 1
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myrepo/myapp:latest
        volumeMounts:
          - name: log
            mountPath: /var/log
      volumes:
      - name: log
        persistentVolumeClaim:
          claimName: myapp-log-pvc

And that's it. When your deployment starts, it'll create the pod with the container, mount a volume named log at the path /var/log (inside the container), and bind this volume to a PV matching the requirements of the PVC named myapp-log-pvc. As we've created myapp-log-pv with the same volumeMode, accessModes and storageClassName fields, and with more storage capacity than that required by myapp-log-pvc, they will be bound. So your app logs will be stored in the path /var/log/myapp (the spec.local.path field in the myapp-log-pv spec) on the node running the pod.

I hope it helps :)

Also, I'm kinda new to the Kubernetes world, so please let me know if you notice I misunderstood something or if there is a better way to do this.

Victor Coll
  • 119
  • 1
  • 3
  • How do I modify my PV if a new node is added? – Ankit Bansal May 31 '22 at 12:10
  • I don't know exactly why, but PVs can't be modified after creation. So you have to delete the current one and create a new one in this case. – Victor Coll Jun 21 '22 at 05:08
  • 1
    Let's say there are 2 pods created by `myapp-deploy`, running on the same host `mynode-1`. Is there chaos when they write to the same log in the same folder `/var/log/myapp/myapp.log` at the same time? – Bryan Chen Dec 13 '22 at 06:03
6

Under Logging Architecture, the Kubernetes documentation goes through a couple of ways to set up logging in your cluster.

The most interesting part for you might be Cluster-level logging architecture:

While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches you can consider. Here are some options:

  • Use a node-level logging agent that runs on every node.
  • Include a dedicated sidecar container for logging in an application pod.
  • Push logs directly to a backend from within an application.
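
For instance, the sidecar option can be sketched roughly like this (illustrative names, with a busybox loop standing in for a real app; the app container writes to a shared emptyDir volume and the sidecar streams the file to its own stdout, where kubectl logs and node-level agents can pick it up):

apiVersion: v1
kind: Pod
metadata:
  name: myapp-with-log-sidecar
spec:
  containers:
  - name: myapp
    image: busybox:1.28
    # stand-in for a real app that writes its logs to a file
    args: [/bin/sh, -c, 'while true; do echo "$(date) INFO app is running" >> /var/log/myapp/app.log; sleep 5; done']
    volumeMounts:
    - name: logs
      mountPath: /var/log/myapp
  - name: log-streamer
    image: busybox:1.28
    # sidecar: streams the shared log file to its own stdout
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/myapp/app.log']
    volumeMounts:
    - name: logs
      mountPath: /var/log/myapp
  volumes:
  - name: logs
    emptyDir: {}

With this in place, kubectl logs myapp-with-log-sidecar -c log-streamer shows the file's contents even though the app itself doesn't write to stdout.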

There are many solutions for collecting pod logs and shipping them to a centralized location, such as the EFK stack (Elasticsearch, Fluentd, Kibana) or Graylog with Filebeat, both mentioned in the comments above.

Keeping logs outside of the cluster has benefits: if your cluster begins to have issues, it's likely that your in-cluster logging architecture will face them as well.

acid_fuji
  • 6,287
  • 7
  • 22
5

You will need to mount the container's log directory onto the host machine as well, using a PersistentVolume and PersistentVolumeClaim.

This way you can persist these logs even if the container is killed.

Create the PersistentVolume and PersistentVolumeClaim for the log path and use them as volume mounts in your Kubernetes Deployments or Pods, as in the sketch below.
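
A minimal sketch of that wiring (an illustrative bare pod; Victor Coll's answer above shows the full PV, PVC and Deployment manifests):

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: myapp
    image: myrepo/myapp:latest       # illustrative image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log            # whatever the app writes here outlives the container
  volumes:
  - name: app-logs
    persistentVolumeClaim:
      claimName: myapp-log-pvc       # the PVC created for the log path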

Aayush Mall
  • 963
  • 8
  • 20