
I'm working on a Spring Boot project with a Postgres database backend where JUnit 5 and Testcontainers are used for integration tests that involve database access.

Testcontainers is set up by modifying the JDBC URL like this:

spring:
  datasource:
    url: jdbc:tc:postgresql:9.6.8:///test

This setup worked fine for many months, but now I'm hitting a roadblock.

So far there are 20 integration test classes, and adding another one leads to failing tests due to an error that looks like a timeout to me.

When adding the 21st test class, another test (let's call it RandomTest) hangs for a few minutes and then fails with this error:

 java.lang.IllegalStateException at DefaultCacheAwareContextLoaderDelegate.java:98
        Caused by: org.springframework.beans.factory.BeanCreationException at AbstractAutowireCapableBeanFactory.java:1804
            Caused by: org.flywaydb.core.internal.exception.FlywaySqlException at JdbcUtils.java:68
                Caused by: java.sql.SQLException at JdbcDatabaseContainer.java:263
                    Caused by: org.postgresql.util.PSQLException at ConnectionFactoryImpl.java:659
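
The trace above lists only exception types and source locations. The message on the innermost PSQLException is what would tell a connection limit apart from a timeout. Here is a minimal sketch of a helper that digs it out, using only the JDK. The class name RootCause is made up for illustration, and the exception chain in main is a stand-in with placeholder messages, not the real failure:

```java
import java.sql.SQLException;

/** Walks a "Caused by" chain down to its innermost exception. */
final class RootCause {

    private RootCause() {}

    /** Returns the deepest cause of {@code t}, or {@code t} itself if it has none. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop at the last exception; the identity check guards against
        // an exception that reports itself as its own cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /** One line per level (class name and message), outermost first. */
    static String describeChain(Throwable t) {
        StringBuilder sb = new StringBuilder();
        Throwable current = t;
        while (current != null) {
            sb.append(current.getClass().getName())
              .append(": ")
              .append(current.getMessage())
              .append('\n');
            Throwable next = current.getCause();
            current = (next == current) ? null : next;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in chain; the messages are placeholders, not the errors from this question.
        Throwable failure = new IllegalStateException("outer placeholder",
                new RuntimeException("middle placeholder",
                        new SQLException("placeholder for the driver's message")));
        System.out.print(describeChain(failure));
        System.out.println("root: " + of(failure).getClass().getSimpleName());
    }
}
```

Calling something like RootCause.of(e).getMessage() from a catch block around the failing setup, or logging describeChain(e), should surface the text the Gradle summary leaves out.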

I know it can't be a problem with the test per se, because when I run it individually, there's no problem:

./gradlew test --tests RandomTest
[...]
BUILD SUCCESSFUL in 16s

It may also be noteworthy that I only have this problem when running the tests with Gradle (both locally and on the CI server). I don't see this problem when running them in IntelliJ.

So it looks to me like some kind of resource problem, such as the Postgres instance that Testcontainers starts running out of memory or connections, but that's just a guess.
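
One way to sanity-check the "out of connections" guess is simple arithmetic. The sketch below rests on assumptions that are not confirmed for this project: that every distinct Spring test context kept in the context cache holds its own HikariCP pool open (default maximum pool size 10), that all pools talk to the same container, and that the container runs with Postgres' default max_connections of 100. The context counts in main are purely hypothetical, and ConnectionBudget is a made-up name:

```java
/** Back-of-the-envelope check: can the cached connection pools exhaust Postgres? */
final class ConnectionBudget {

    // Library defaults, assumed to be unchanged in this project.
    static final int HIKARI_DEFAULT_MAX_POOL_SIZE = 10;
    static final int POSTGRES_DEFAULT_MAX_CONNECTIONS = 100;

    private ConnectionBudget() {}

    /** Upper bound on open connections if every cached context fills its pool. */
    static int worstCaseConnections(int cachedContexts, int poolSize) {
        return Math.multiplyExact(cachedContexts, poolSize);
    }

    /** True if the worst case would need more connections than the server allows. */
    static boolean exceedsLimit(int cachedContexts, int poolSize, int maxConnections) {
        return worstCaseConnections(cachedContexts, poolSize) > maxConnections;
    }

    public static void main(String[] args) {
        // Hypothetical numbers of distinct cached contexts; the real number depends on
        // how many test classes differ in their context configuration.
        for (int contexts : new int[] {1, 5, 10, 11, 21}) {
            System.out.printf("%d context(s): up to %d connections, over limit: %b%n",
                    contexts,
                    worstCaseConnections(contexts, HIKARI_DEFAULT_MAX_POOL_SIZE),
                    exceedsLimit(contexts, HIKARI_DEFAULT_MAX_POOL_SIZE,
                            POSTGRES_DEFAULT_MAX_CONNECTIONS));
        }
    }
}
```

If the numbers come out close to the limit, the exact PSQLException message should confirm or rule this out. This is only a rough bound, not a diagnosis.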

I tried different configuration modifications that I found in the Testcontainers docs:

  1. Running the container in daemon mode like this:
spring:
  datasource:
    url: jdbc:tc:postgresql:9.6.8:///test?TC_DAEMON=true
  2. Disabling Ryuk by setting TESTCONTAINERS_RYUK_DISABLED=true
  3. Starting Ryuk in (un-)privileged mode explicitly with ryuk.container.privileged=true|false (I tried both because I'm not sure what the default is)

None of these had a noticeable impact in terms of my problem.

Maybe we are using Testcontainers for too many tests? Should I instead use H2 for most integration tests and use Testcontainers only for a few selected tests to make sure that everything works with the production database?

Or am I missing something?

anothernode
  • Why did you expect this issue is in any way related to Ryuk? We need more logs to understand the issue, but for me, this looks like an issue related to Spring connection pooling in conjunction with Flyway, hard to say. In your test setup, wouldn't you expect to reuse the same container for every test? Can you share a reproducer? Also the comment regarding IntelliJ vs Gradle execution hints at something weird, class loading issues? – Kevin Wittek Dec 21 '22 at 15:01
  • Playing with the Ryuk config was just a shot in the dark, to be honest. Only one postgres container gets started for all the tests in one test run, yes. Not sure what you mean by reproducer? Yeah, it is a bit of a weird problem :) I mean, the project is still running on a fairly old version of Spring Boot (2.5) and Flyway (7.7), so maybe I should try and update those first. Anyway, thanks for your suggestions! – anothernode Dec 22 '22 at 10:15
  • Is there anything special about your new RandomTest? What if, instead of the RandomTest, you add a copy of an old existing test under a different name: will the build still fail? I don't think it's related to performance; I've seen setups with 100+ integration tests with Testcontainers without any issues. I'd guess one of your tests messes up the shared cached Spring context and your new test is executed afterwards with this "broken" context. Btw, what is the exact PSQLException? – yuppie-flu Dec 22 '22 at 19:41
  • Yeah, I think I actually found the cause of the problem lying in the newly added test as described in the answer I provided. Thanks a lot to both @KevinWittek and yuppie-flu for giving me important data points to think about! – anothernode Dec 27 '22 at 16:35

1 Answer


Okay, it turned out that it actually was a problem with the newly added test.

The test author had added a method that was supposed to clean up the database after the test like this:

@AfterEach
public void beforeEach() {
    fooRepository.deleteAll();
    barRepository.deleteAll();
    bazRepository.deleteAll();
}

After removing this, all the tests work fine again. My guess is that this cleanup takes a bit longer than the test itself, so the database connection is not released in time for the next test to use it, or something along those lines.

anothernode
  • This is strange, the other tests should not start until the `AfterEach` method has finished. It still looks like some kind of Spring usage issue to me. – Kevin Wittek Jan 03 '23 at 08:18
  • Yeah, maybe, I don't know. The configuration of the database connection seems to be very basic; I can't imagine what could be wrong with it. There's one fairly wild thing going on in the entity code: a manually implemented custom Hibernate type. Maybe that one's causing trouble... But for now, removing the `@AfterEach` method works fine for me. It's redundant anyway because all tests already clear the repos _before_ running. – anothernode Jan 03 '23 at 08:55