I am experimenting with Kubernetes and the Spring Boot liveness and readiness probes. One thing I cannot work out is how my Spring Boot application is supposed to recover from a failed state. My scenario is simple: the app connects to an S3 bucket and tries to download its content. When the bucket is not available, I want the readiness state to change automatically to REFUSING_TRAFFIC. When the bucket is available again, I want the readiness state to go back to ACCEPTING_TRAFFIC.
How do I do that?
Here is what I have:
@Slf4j
public class AcpS3EnvironmentRepository extends AbstractScmEnvironmentRepository
        implements EnvironmentRepository, SearchPathLocator, InitializingBean {

    private static final String DEFAULT_CONFIG_VERSION = "latest";

    private final AcpS3TransferManager transferManager;
    private final ApplicationEventPublisher eventPublisher;

    public AcpS3EnvironmentRepository(ConfigurableEnvironment environment,
                                      AcpS3RepositoryProperties properties,
                                      AcpS3TransferManager transferManager,
                                      ApplicationEventPublisher publisher) {
        super(environment, properties);
        this.transferManager = transferManager;
        this.eventPublisher = publisher;
    }

    @Override
    public synchronized void afterPropertiesSet() {
        Assert.state(getUri() != null, "You need to configure a uri for the aws s3 bucket");
    }

    @Override
    public synchronized Locations getLocations(String application, String profile, String label) {
        try {
            transferManager.downloadBucket(new AmazonS3URI(getUri()).getBucket(), getBasedir().toPath());
        } catch (Exception ex) {
            log.error("Could not load data from bucket " + getUri(), ex);
            AvailabilityChangeEvent.publish(eventPublisher, this, ReadinessState.REFUSING_TRAFFIC);
            throw new AcpS3BucketIllegalException("Could not load data from bucket " + getUri());
        }
        return new Locations(
                application,
                profile,
                label,
                DEFAULT_CONFIG_VERSION,
                getSearchLocations(
                        getWorkingDirectory(),
                        application,
                        profile,
                        label
                )
        );
    }
}
What I observe currently:
- I make S3 unavailable
- Readiness changes to REFUSING_TRAFFIC
- S3 becomes available again
- Readiness is still REFUSING_TRAFFIC (but it should be ACCEPTING_TRAFFIC at this point)
So my question is: how do I get the readiness state to recover once S3 is back, in the best possible way?
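To make the desired behaviour concrete, here is a minimal sketch of the state transitions, written without Spring so that it runs on its own. `Readiness`, `BucketDownload`, and `ReadinessTracker` are hypothetical stand-ins, not Spring or AWS types: `Readiness` mirrors Spring Boot's `ReadinessState`, and the `publish` method marks the place where `AvailabilityChangeEvent.publish(...)` would be called. The sketch assumes that the code above only ever publishes REFUSING_TRAFFIC, so a possible fix is to publish ACCEPTING_TRAFFIC after every successful download as well.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Spring Boot's ReadinessState enum.
enum Readiness { ACCEPTING_TRAFFIC, REFUSING_TRAFFIC }

// Hypothetical stand-in for the S3 download (transferManager.downloadBucket in the code above).
interface BucketDownload {
    void run() throws Exception;
}

// Hypothetical helper that records the readiness state after each download attempt.
class ReadinessTracker {
    private final AtomicReference<Readiness> state =
            new AtomicReference<>(Readiness.ACCEPTING_TRAFFIC);

    // In a Spring application this is where
    // AvailabilityChangeEvent.publish(eventPublisher, this, newState) would go.
    private void publish(Readiness newState) {
        state.set(newState);
    }

    Readiness current() {
        return state.get();
    }

    // Attempts the download and publishes the matching state in BOTH outcomes,
    // so that a later success moves the state back to ACCEPTING_TRAFFIC.
    boolean refresh(BucketDownload download) {
        try {
            download.run();
            publish(Readiness.ACCEPTING_TRAFFIC);
            return true;
        } catch (Exception ex) {
            publish(Readiness.REFUSING_TRAFFIC);
            return false;
        }
    }
}
```

For example, `tracker.refresh(() -> { throw new Exception("bucket down"); })` leaves `tracker.current()` at REFUSING_TRAFFIC, and a following `tracker.refresh(() -> {})` moves it back to ACCEPTING_TRAFFIC. Note that once a pod fails its readiness probe, Kubernetes stops routing Service traffic to it, so `getLocations` may never be invoked again by incoming requests. Under that assumption, something other than request handling, such as a periodic task, would have to call the equivalent of `refresh`.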