If you are getting the "No space left on device" error even when disk usage and inode usage are low, it might be that the disk resources for your specific pod are limited. Kubernetes can set limits on resources like CPU, memory, and ephemeral (disk) storage.
So start by checking the Kubernetes resource limits and requests for your pod: run kubectl describe pod <my-pod> to see whether resource limits or requests are set. Look for something like:
Resources:
  Limits:
    ephemeral-storage: 1Gi
  Requests:
    ephemeral-storage: 500Mi
The ephemeral-storage value represents the local, node-backed storage available to your pod. If the limit is set too low, you may need to raise it.
You can also set these resource requests and limits yourself by adding the following to the container spec in your pod or deployment configuration:
resources:
  requests:
    ephemeral-storage: "1Gi"
  limits:
    ephemeral-storage: "2Gi"
That allows your pod to request 1 GiB of ephemeral storage and limit it to using 2 GiB. Adjust these values as needed based on the size of the images you are dealing with.
Another approach would be to use a Persistent Volume (PV): if your application needs to store a lot of data (like many large image files), use a Persistent Volume (PV) with a Persistent Volume Claim (PVC). PVs represent durable storage in a cluster and are provisioned independently of the pod's lifecycle. Unless you mount the volume at the path your application already writes to, you would need to change the application's code or configuration to point at the new path.
Define a PV and PVC:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
And in your pod spec, you would add:
volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-pvc
And mount it into your container:
volumeMounts:
  - mountPath: "/code/pca_back_data/media"
    name: my-storage
When you create the Persistent Volume (PV) and mount it into your pod at the same location (/code/pca_back_data/media), your application will continue to write to the same directory without needing to change the Django settings.
The only difference is that the storage will now be backed by a Persistent Volume, which is designed to hold larger amounts of data and is not subject to the pod's ephemeral-storage limits. No changes would be required in your Django settings: the application keeps writing to the same path, only the underlying storage mechanism changes.
However, do note that hostPath should be used only for development or testing. For production, consider networked storage such as an NFS server or a cloud provider's storage service.
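For instance, here is a minimal sketch of an NFS-backed PV; the server address and export path are placeholders, not values from this question:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany        # NFS typically allows several pods to mount read-write
  nfs:
    server: nfs.example.com   # placeholder: your NFS server
    path: /exports/media      # placeholder: your exported directory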
The OP comments:
I am already using a PVC that is attached to this pod, and it has more than enough storage. What is stranger is that not all files are failing this…
As I commented, it could be a concurrency issue: if multiple processes or threads are trying to write to the same file or location simultaneously, that might cause some operations to fail with "No space left on device" errors.
Also, although the PVC has enough available space, individual filesystems on the PVC might have quotas that limit how much space they can use. Check whether any such quotas are set on your filesystem.
The OP confirms:
There is something like this happening - multiple processes are using the same PVC directory; maybe not exactly the same file, but the same parent directory can be accessed by those processes.
Multiple processes using the same PVC directory or parent directory should generally not be a problem, as long as they are not trying to write to the same file at the same time. But if these processes are creating a large number of files or very large files, and if your PVC or underlying filesystem has a limit on the number of files (inodes) or the total size of files it can handle, that could potentially lead to the "No space left on device" error.
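A quick way to rule this out is to check both free space and free inodes on the PVC mount from inside the pod (the mount path below is the one used in this question; adjust it if yours differs):

kubectl exec -it <your-pod> -- df -h /code/pca_back_data/media   # free space on the mount
kubectl exec -it <your-pod> -- df -i /code/pca_back_data/media   # free inodes on the mount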
You can check for filesystem quotas on a PVC:
- Connect to your pod: kubectl exec -it <your-pod> -- /bin/bash
- Install the quota package: this can usually be done with apt-get install quota on Debian/Ubuntu systems or yum install quota on CentOS/RHEL systems. If these commands do not work, you may need to look up how to install quota for your specific container's operating system.
- Check quotas: run quota -v to view quota information. If quotas are enabled and you are nearing or at the limit, you will see that here.
If your filesystem does not support quotas or they are not enabled, you will not get useful output from quota -v. In that case, or if you are unable to install the quota package, you might need to check for quotas from outside the pod, which will depend on your Kubernetes setup and cloud provider.
If you are still having trouble, another possible culprit could be the Linux kernel parameter fs.inotify.max_user_watches, which limits the number of files the system can monitor for changes. If you are opening and not properly closing a large number of files, you could be hitting this limit. You can check its value with cat /proc/sys/fs/inotify/max_user_watches and increase it if necessary.
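For example, to inspect and raise the limit; this is a node-level kernel setting, so it must be changed on the node rather than inside the container, and the value below is only an illustration:

cat /proc/sys/fs/inotify/max_user_watches           # current limit
sudo sysctl -w fs.inotify.max_user_watches=524288   # raise it until the next reboot
# To persist the change, add "fs.inotify.max_user_watches=524288" to /etc/sysctl.conf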
The OP adds:
I think the issue in my case is that the /tmp folder inside the pod is running out of space (in Django, /tmp is used for the files when saving to the database, if I understand correctly); not sure how to expand the size of it?
Yes, you're correct. Django, like many other systems, uses the /tmp directory to handle temporary files, which includes processing file uploads. If the /tmp directory is running out of space, you can consider the following options:
- Increase the Pod ephemeral storage limit: as mentioned above, you can adjust the ephemeral storage requests and limits in your pod or deployment configuration, like so:
resources:
  requests:
    ephemeral-storage: "2Gi" # Request 2Gi of ephemeral storage
  limits:
    ephemeral-storage: "4Gi" # Limit ephemeral storage usage to 4Gi
Remember to adjust these values according to your needs.
- Or use an emptyDir volume for /tmp: that is, mount a Kubernetes emptyDir volume at the /tmp directory. When a Pod is assigned to a Node, Kubernetes creates an emptyDir volume for that Pod, and it exists as long as that Pod is running on that node. The emptyDir volume can use the node's storage space, and you can specify a size limit.
Here is how you might define an emptyDir volume for /tmp:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: my-container
      image: my-image
      volumeMounts:
        - name: tmp-storage
          mountPath: /tmp
  volumes:
    - name: tmp-storage
      emptyDir:
        medium: "Memory"
        sizeLimit: "2Gi" # Set a size limit for the volume
The medium: "Memory" setting means that the emptyDir volume is backed by memory (tmpfs) instead of disk storage; note that a memory-backed volume counts against the container's memory usage. If you remove this line, the emptyDir volume will use the node's disk storage. The sizeLimit field is optional.
- You can also consider using a dedicated PVC for /tmp: if the above options are not feasible or you need more control over the storage for /tmp, you can use a dedicated PVC for it, similar to the one you're using for /code/pca_back_data/media (see the sketch after this list).
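For example, a minimal sketch of a dedicated claim for /tmp; the claim name and size are placeholders to adjust to your cluster and storage class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tmp-pvc             # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi          # placeholder size for temporary files

Then reference it in your pod spec:

volumes:
  - name: tmp-storage
    persistentVolumeClaim:
      claimName: tmp-pvc

and mount it in the container:

volumeMounts:
  - name: tmp-storage
    mountPath: /tmp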
Remember that changes to your pod or deployment configuration need to be applied with kubectl apply -f <configuration-file>, and you may need to recreate your pod or deployment for the changes to take effect.
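For instance, assuming the changes live in deployment.yaml and the workload is a Deployment named my-deployment (both names are placeholders):

kubectl apply -f deployment.yaml
kubectl rollout restart deployment/my-deployment   # recreate the pods with the new spec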
The OP concludes in the comments:
I managed to solve this issue: it looks like the GCP storage disk was somehow corrupted; we changed to another one and it seems to be fine now.