We have deployed Vespa on a 3-node GKE cluster using Kubernetes. Our Dockerfile uses Vespa 7.351.32 as the base image and adds a few things to it:
- GCloud SDK
- Some script files that copy our logs to GCS
- workspace folder
The workspace folder contains all the necessary .xml and other files required for the Vespa deployment.
Below are the steps we execute inside each of the three pods to deploy the application and restart the config server:
/opt/vespa/bin/vespa-deploy prepare /workspace && /opt/vespa/bin/vespa-deploy activate
wait (5 min)
vespa-stop-services
vespa-stop-configserver
wait (15 min)
vespa-start-configserver
vespa-start-services
vespa-get-cluster-state
vespa-config-status
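For reference, the sequence above could be scripted roughly as follows. This is only a sketch: the `run` wrapper, the `DRY_RUN` switch, and the `deploy_and_restart` function name are illustrative additions and not part of our actual setup. Only the `vespa-*` commands and the wait times come from the steps listed above. By default it just prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the deploy-and-restart sequence listed above.
# DRY_RUN, run() and deploy_and_restart() are hypothetical helpers for illustration.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = execute them

# Print the command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

deploy_and_restart() {
  run /opt/vespa/bin/vespa-deploy prepare /workspace
  run /opt/vespa/bin/vespa-deploy activate
  run sleep 300                 # wait 5 min
  run vespa-stop-services
  run vespa-stop-configserver
  run sleep 900                 # wait 15 min
  run vespa-start-configserver
  run vespa-start-services
  run vespa-get-cluster-state
  run vespa-config-status
}

deploy_and_restart
```

Running it with `DRY_RUN=0` inside a pod would execute the steps for real. Because of `set -e`, a failing step stops the sequence, which mirrors the `&&` between prepare and activate.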
Then we receive the following error.
Please find below the screenshot showing connectivity to port 2181 on all three pods.
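A connectivity check of this kind can be reproduced from inside a pod with something like the sketch below. The host names are an assumption derived from the manifest (StatefulSet `vespa`, service `vespa-internal`, namespace `vespa`) plus the default `cluster.local` cluster domain. `check_port` and `check_all` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: test TCP reachability of port 2181 on each pod.
# Host names assume StatefulSet "vespa", service "vespa-internal",
# namespace "vespa" and the default cluster domain "cluster.local".

check_port() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the attempt.
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port open"
  else
    echo "$host:$port unreachable"
    return 1
  fi
}

check_all() {
  local i rc=0
  for i in 0 1 2; do
    check_port "vespa-$i.vespa-internal.vespa.svc.cluster.local" 2181 || rc=1
  done
  return "$rc"
}

# Usage inside a pod: check_all
```

An open port only shows that something is listening on 2181. It does not by itself confirm that the config servers have formed a healthy ZooKeeper quorum.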
Upon further inspection of the logs (using vespa-logfmt -l error), we found that the com.yahoo.container.handler.threadpool.DefaultContainerThreadpool
bundle fails to load. Manually restarting the config server and Vespa services seems to solve the issue.
Attaching the related log below.
Please help us understand the following points:
Does some service need to be running before this bundle is loaded?
Is there a path issue? If so where can we find this bundle?
Is this because of a memory issue (we have the recommended 4 GB)?
How does Vespa load these bundles?
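Regarding the memory question, one way to see which limit the container actually gets is to read it from the cgroup filesystem inside a pod and compare it with the `memory` values in the manifest. The sketch below assumes the standard cgroup v2 and v1 paths. Which one exists depends on the node image, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: report the memory limit visible inside the container.
# The two paths are the usual cgroup v2 and cgroup v1 locations (assumption:
# one of them is mounted in the pod).

read_memory_limit() {
  local f
  for f in /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory/memory.limit_in_bytes; do
    if [ -r "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  return 1
}

# Convert a byte count to whole MiB; cgroup v2 prints "max" when no limit is set.
bytes_to_mib() {
  if [ "$1" = "max" ]; then
    echo "unlimited"
  else
    echo "$(( $1 / 1024 / 1024 )) MiB"
  fi
}

if limit="$(read_memory_limit)"; then
  echo "memory limit: $(bytes_to_mib "$limit")"
else
  echo "memory limit: unknown (no cgroup memory file found)"
fi
```

Note that Kubernetes treats a `G` suffix as 10^9 bytes, so a limit written as `"2G"` shows up here as 1907 MiB rather than 2048 MiB.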
Additional details of the setup are below.
Dockerfile
FROM vespaengine/vespa:7.351.32
# Copy necessary files
RUN mkdir -p workspace
COPY workspace /workspace
# -y answers yum's confirmation prompt, which cannot be answered during a non-interactive build
RUN yum install -y python3
COPY backup-pod.sh /
# Downloading gcloud package
RUN curl https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz > /tmp/google-cloud-sdk.tar.gz
# Installing the package
RUN mkdir -p /usr/local/gcloud \
    && tar -C /usr/local/gcloud -xvf /tmp/google-cloud-sdk.tar.gz \
    && /usr/local/gcloud/google-cloud-sdk/install.sh
# Adding the gcloud binaries to PATH
ENV PATH $PATH:/usr/local/gcloud/google-cloud-sdk/bin
Manifest
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vespa
  namespace: vespa
  labels:
    app: vespa
spec:
  replicas: 3
  #serviceName: vespa
  selector:
    matchLabels:
      app: vespa
      name: vespa-internal
  serviceName: vespa-internal
  template:
    metadata:
      labels:
        app: vespa
        name: vespa-internal
    spec:
      serviceAccount: vespa-sa
      # nodeSelector:
      #   iam.gke.io/gke-metadata-server-enabled: "true"
      containers:
      - name: vespa
        image: asia-south1-docker.pkg.dev/aurum-projec/vespa/vespa:latest
        imagePullPolicy: Always
        securityContext:
          privileged: true
        ports:
        - containerPort: 8080
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /ApplicationStatus
            port: 19071
            scheme: HTTP
        volumeMounts:
        - name: vespa-var
          mountPath: /opt/vespa/var
        - name: vespa-logs
          mountPath: /opt/vespa/logs
        resources:
          requests:
            memory: "2G"
          limits:
            memory: "2G"
  volumeClaimTemplates:
  - metadata:
      name: vespa-var
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
  - metadata:
      name: vespa-logs
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi