I have a requirement to watch for the latest documents in MongoDB. I am using the ChangeStream watch API to fetch the change events from the collection.

The setup is a replica set with 3 nodes running on the same machine, on ports 27017, 27018 and 27019. Authentication is not configured.

mongodb.conf file:

systemLog:
  destination: file
  logAppend: true
  path: /mongodb/logs/mongodb.log
storage:
  dbPath: /mongodb/data/d1
  journal:
    enabled: true
  engine: "wiredTiger"
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4
net:
  port: 27017
  bindIp: localhost  

I performed a bulk insert from a file containing 72663 documents. The program below reported a throughput of only 8073 records per second.

The Java code I use to watch the collection is:

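    import com.mongodb.MongoClientSettings;
    import com.mongodb.ServerAddress;
    import com.mongodb.client.MongoChangeStreamCursor;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.model.changestream.ChangeStreamDocument;
    import org.bson.Document;

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Arrays;
    import java.util.List;

    public class ChangeStreamWatcher {
        public static void main(String[] args) {
            // All three replica set members run on localhost.
            List<ServerAddress> serverAddress = Arrays.asList(
                    new ServerAddress("localhost", 27019),
                    new ServerAddress("localhost", 27018),
                    new ServerAddress("localhost", 27017));
            MongoClientSettings settings = MongoClientSettings.builder()
                    .applyToClusterSettings(builder -> builder.hosts(serverAddress))
                    .build();

            int count = 0;
            Instant start = null;

            try (MongoClient client = MongoClients.create(settings);
                 MongoChangeStreamCursor<ChangeStreamDocument<Document>> dep = client.getDatabase("MyDB")
                         .getCollection("TestCollection").watch().cursor()) {
                // Keep watching; the rate is printed once, after 72663 events.
                while (true) {
                    while (dep.hasNext()) {
                        ChangeStreamDocument<Document> next = dep.next();
                        count++;
                        if (count == 1) {
                            start = Instant.now(); // start timing at the first event
                        }
                        if (count == 72663) {
                            Instant end = Instant.now();
                            Duration timeElapsed = Duration.between(start, end);
                            long seconds = timeElapsed.getSeconds(); // whole seconds only
                            long rec = count / seconds;
                            System.out.println("records processed per second  " + rec);
                        }
                    }
                }
            }
        }
    }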

Is there a way to get better performance out of the change stream API? Or is there another API that gives better performance for watching documents? Or are there replication settings that would improve it?
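
A note on the measurement: 8073 is exactly what integer division yields when `getSeconds()` returns 9 (72663 / 9 = 8073), so the elapsed time was somewhere between 9 and 10 seconds and the fractional part was discarded. The sketch below uses only `java.time` and two hypothetical elapsed times (9.0 s and 9.9 s) to show how much that truncation can hide. The class and method names are made up for illustration.

```java
import java.time.Duration;

class ThroughputCheck {
    // Rate as computed in the question: integer division by whole seconds.
    static long truncatedRate(long count, Duration elapsed) {
        return count / elapsed.getSeconds();
    }

    // Rate computed with nanosecond resolution, so sub-second time is kept.
    static double preciseRate(long count, Duration elapsed) {
        return count * 1_000_000_000.0 / elapsed.toNanos();
    }

    public static void main(String[] args) {
        long count = 72663;
        // Hypothetical elapsed times; getSeconds() reports 9 for both.
        Duration shortRun = Duration.ofMillis(9_000);
        Duration longRun = Duration.ofMillis(9_900);

        System.out.println(truncatedRate(count, shortRun)); // prints 8073
        System.out.println(truncatedRate(count, longRun));  // prints 8073
        System.out.printf("%.1f%n", preciseRate(count, shortRun));
        System.out.printf("%.1f%n", preciseRate(count, longRun));
    }
}
```

Both runs report the same truncated figure, although the second one is noticeably slower. Measuring with `toNanos()` or `toMillis()` avoids this, and also avoids a division by zero when the run finishes in under one second.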

Forketyfork
  • Are you asking about write performance or change stream performance and what is your expectation? – D. SM Jul 21 '20 at 22:49
  • I was checking on the ChangeStream performance. I was expecting more than 50000 documents per second by using the change stream watch API. – sabareesh babu Jul 22 '20 at 04:43

1 Answer

I wrote and ran a benchmark.

On a $100 consumer-grade SFF desktop with an i5-4460S, with the database in memory, I could obtain 17k documents written per second to zram. The database was CPU limited.

At this point the change stream performance is bound by the insert performance, and the change stream delivered the 17k changes/sec.

The change stream was, however, bursty, and the bursts showed higher throughput than the database could sustain for writes on this hardware.

Based on this, I suggest that change stream throughput exceeds the rate at which the database can process writes.
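
One way to tell bursts apart from sustained throughput is to record the arrival time of each change event and compare the overall average with the busiest short window. The following is a minimal sketch under that assumption, not the benchmark referred to above. The class name, method names and sample timestamps are all hypothetical.

```java
class BurstStats {
    // Average events per second over the whole run (first to last arrival).
    static double sustainedRate(long[] arrivalMillis) {
        if (arrivalMillis.length < 2) {
            throw new IllegalArgumentException("need at least two events");
        }
        long span = arrivalMillis[arrivalMillis.length - 1] - arrivalMillis[0];
        if (span == 0) {
            throw new IllegalArgumentException("all events share one timestamp");
        }
        return arrivalMillis.length * 1000.0 / span;
    }

    // Largest number of events in any sliding window shorter than windowMillis.
    // Assumes arrivalMillis is sorted in ascending order.
    static int peakInWindow(long[] arrivalMillis, long windowMillis) {
        int best = 0;
        int left = 0;
        for (int right = 0; right < arrivalMillis.length; right++) {
            // Shrink the window until it spans less than windowMillis.
            while (arrivalMillis[right] - arrivalMillis[left] >= windowMillis) {
                left++;
            }
            best = Math.max(best, right - left + 1);
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical arrivals: two bursts of five events, one second apart.
        long[] arrivals = {0, 1, 2, 3, 4, 1000, 1001, 1002, 1003, 1004};
        int peak = peakInWindow(arrivals, 100);
        System.out.println("peak events in 100 ms: " + peak); // prints 5
        System.out.println("peak rate per second: " + peak * 10);
        System.out.printf("sustained rate per second: %.2f%n", sustainedRate(arrivals));
    }
}
```

In this made-up data the peak window implies a rate several times higher than the sustained average, which is the kind of gap that bursty delivery produces.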

D. SM