
We developed a web app that relies on real-time interaction between our users. We use Angular for the frontend, and Hasura with GraphQL on Postgres as our backend. We noticed that when more than 300 users are active at the same time, we experience severe performance degradation.
Therefore, we want to improve our subscriptions setup.
We think that possible issues could be:

  1. Too many subscriptions
  2. Too large and complex subscriptions, with too many forks in the subscription

Concerning 1., each user has approximately 5-10 active subscriptions when using the web app. Concerning 2., we have complex subscriptions that join up to 6 tables together.

The solutions we think of:

  1. Use more queries and limit subscriptions to the fields that absolutely need to be real-time.
  2. Split up complex queries/subscriptions into multiple smaller ones.

Are we missing another possible cause? What else can we do to improve the overall performance?

Thank you for your input!
Flowmerca
  • You don't include anything about PostgreSQL here, other than that you are using it. You probably shouldn't use the postgresql tag if you aren't interested in digging into PostgreSQL-specific issues, such as logging slow queries and doing `EXPLAIN (ANALYZE, BUFFERS)` on them. – jjanes Jul 28 '21 at 16:19
  • I think the 1st is the correct one. What I encountered: programmers did a lot of heavy subscriptions with all the fields needed to draw a grid. We split the logic into parts: a) a subscription for just a single field, the change date; b) when the application saw that the value had changed, it ran c) a query for the whole dataset. Note on the subscription: it's OK if it sometimes returns false positives - anyway, it's better to make an unneeded query for the whole dataset once or twice a minute than to do that every second. After we did that, CPU usage on the PostgreSQL instance went down drastically – Alex Yu Jul 29 '21 at 04:10
  • About "too many subscriptions": Hasura can multiplex several subscriptions into a single one - personally, I would not count on that. What I would do: try to create a function that detects the fact "something was **possibly** changed and it's better to requery all data". Let's say you have entityA, entityB, ..., entityZ - each one can be changed. You can create a single subscription that detects the fact "data in one or all entities has changed" and triggers queries for entityA-entityZ. Single subscription -> multiple queries. – Alex Yu Jul 29 '21 at 04:23
  • It would be great if you tried to dissect your problem, found a simplified example that illustrates it, and then added it to your question – Alex Yu Jul 30 '21 at 05:29

1 Answer


Preface

The OP's question is quite broad and impossible to answer in the general case.

So what I describe here reflects my experience with optimizing subscriptions - it's for the OP to decide whether it reflects their situation.

Short description of the system

Users of the system upload documents, extract information, prepare new documents, and converse during the process (IM-like functionality); there are AI bots that try to reduce the burden of repetitive tasks, and services that exchange data with external systems.

There are a lot of entities and a lot of interaction between both human and robot participants. Plus quite complex authorization rules: the visibility of data depends on organization, departments, and the content of documents.

What we had at the start

At first it was:

  • a programmer wrote a GraphQL query for the whole data needed by the application
  • changed the query to a subscription
  • done

It was OK for the first 2-3 months, then:

  • queries became more complex, and then even more complex
  • the number of subscriptions grew
  • the UI became laggy
  • the DB instance was always near 100% load, even during nights and weekends, because somebody did not close the application

First we optimized the queries themselves, but it did not suffice:

  • some things are rightfully costly: JOINs, existence predicates; and the data itself grew significantly
  • the network part: you can optimize the DB, but just transferring all the needed data has its cost

Optimization of subscriptions

Step I. Split subscriptions: subscribe to the change date, query on change

Instead of one complex subscription for the whole data, split it into parts:

A. A subscription for a single field that indicates that an entity was changed

E.g.

Instead of:

subscription{
  document{
    id
    title
    # other fields
    pages{  # array relation
      ...
    }
    tasks{  # array relation
      ...
    }
    # multiple other array/object relations
    # pagination and ordering
  }
}

that returns thousands of rows.

Create a function that:

  • accepts hasura_session - so that results are individual per user
  • returns just one field: max_change_date
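
For illustration, a minimal sketch of such a function in PostgreSQL. All table, column, and session-variable names here are assumptions; Hasura tracks functions that return SETOF a table type and can pass session variables through a JSON argument:

-- Table type backing the function result
-- (Hasura requires tracked functions to return SETOF a table type)
CREATE TABLE doc_change_date_result (
  max_change_date timestamptz
);

-- Latest change visible to the caller's organization.
-- "documents", "updated_at", "org_id" and the session variable are assumptions.
CREATE FUNCTION doc_change_date(hasura_session json)
RETURNS SETOF doc_change_date_result
LANGUAGE sql STABLE AS $$
  SELECT max(d.updated_at) AS max_change_date
  FROM documents d
  WHERE d.org_id = (hasura_session ->> 'x-hasura-org-id')::uuid;
$$;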

So it became:

subscription{
  doc_change_date{
    max_change_date
  }
}

Always one row and always one field

B. Change of application logic

  • Query the whole data
  • Subscribe to doc_change_date
  • Memorize the value of max_change_date
  • If max_change_date changed - requery the data

Notes

It's absolutely OK if the subscription function sometimes returns false positives.

There is no need to replicate all the predicates from the source query in the subscription function.

E.g.

In our case: the visibility of data depends on organizations and departments (and even more).

So if a user of one department creates/modifies a document, this change is not visible to users of other departments.

But those changes happen maybe once or twice a minute per organization.

So for the subscription function we can ignore that granularity and calculate max_change_date for the whole organization.

It's beneficial to have a faster and cruder subscription function: it will trigger refreshes of the data more frequently, but the whole cost will be less.

Step II. Multiplex subscriptions

The first step is a crucial one.

And Hasura has multiplexing of subscriptions: https://hasura.io/docs/latest/graphql/core/databases/postgres/subscriptions/execution-and-performance.html#subscription-multiplexing

So in theory, Hasura could be smart enough to solve your problems.

But if you think "explicit is better than implicit", there is another step you can do.

In our case:

  • users upload documents
  • combine them into dossiers
  • create new document types
  • converse with each other

So the subscriptions became: doc_change_date, dossier_change_date, msg_change_date, and so on.

But actually it could be beneficial to have just one subscription: "hey! there are changes for you!"

So instead of multiple subscriptions, the application makes just one.

Note

We thought about two formats for the multiplexed subscription:

  • A. The subscription returns just one field, {max_change_date}, which is accumulative over all entities
  • B. The subscription returns a more granular result: {doc_change_date, dossier_change_date, msg_change_date}

Right now "A" works for us. But maybe we change to "B" in future.

Step III. What we would do differently with hasura 2.0

This is what we have not tried yet.

Hasura 2.0 allows registering VOLATILE functions for queries.

That allows creating functions with memoization in the DB:

  • you define a cache for function calls, presumably in a table
  • then on a function call you first look in the cache
  • if the value does not exist: add it to the cache
  • return the result from the cache

That allows further optimizations both for subscription functions and query functions.
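
To make this concrete, a sketch of such a memoized function, reusing doc_change_date_result from above. The cache table, the per-organization key, and the 5-second freshness window are all assumptions for illustration:

-- Hypothetical cache table for function results
CREATE TABLE change_date_cache (
  org_id      uuid PRIMARY KEY,
  result      timestamptz,
  computed_at timestamptz NOT NULL
);

-- VOLATILE because it writes to the cache table;
-- Hasura 2.0 is needed to register it for queries.
CREATE FUNCTION cached_doc_change_date(hasura_session json)
RETURNS SETOF doc_change_date_result
LANGUAGE plpgsql VOLATILE AS $$
DECLARE
  v_org    uuid := (hasura_session ->> 'x-hasura-org-id')::uuid;
  v_result timestamptz;
BEGIN
  -- First look in the cache
  SELECT result INTO v_result
  FROM change_date_cache
  WHERE org_id = v_org
    AND computed_at > now() - interval '5 seconds';

  -- If no fresh value exists: compute it and add it to the cache
  IF NOT FOUND THEN
    SELECT max(updated_at) INTO v_result
    FROM documents
    WHERE org_id = v_org;

    INSERT INTO change_date_cache (org_id, result, computed_at)
    VALUES (v_org, v_result, now())
    ON CONFLICT (org_id) DO UPDATE
      SET result = EXCLUDED.result, computed_at = now();
  END IF;

  -- Return the result from the cache
  RETURN QUERY SELECT v_result;
END;
$$;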

Note

Actually, it's possible to do that without waiting for Hasura 2.0, but it requires trickery on the PostgreSQL side:

  • you create a VOLATILE function that does the real work
  • and another function, defined as STABLE, that calls the VOLATILE function; this one can be registered in Hasura (a sketch follows)
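
A sketch of that trick, wrapping the hypothetical VOLATILE function from the previous step:

-- Thin wrapper declared STABLE so that pre-2.0 Hasura accepts it.
-- The STABLE label misinforms the planner, because the inner function
-- writes to the cache table - that is exactly why it's hard to recommend.
CREATE FUNCTION cached_doc_change_date_stable(hasura_session json)
RETURNS SETOF doc_change_date_result
LANGUAGE sql STABLE AS $$
  SELECT * FROM cached_doc_change_date(hasura_session);
$$;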

It works, but this trick is hard to recommend.

Who knows - maybe future PostgreSQL versions or updates will make it impossible.

Summary

That's everything that I can say on the topic right now.

Actually, I would have been glad to read something similar a year ago.

If somebody sees some pitfalls - please comment; I would be glad to hear opinions and maybe alternative ways.

I hope that this explanation will help somebody, or at least provoke thought on how to deal with subscriptions in other ways.

Alex Yu
  • Great answer! Thanks for sharing such detailed insights and experience – Jesse Carter Aug 17 '21 at 19:44
  • Thanks for sharing! We tried different things in the last weeks and reached similar conclusions. – Flowmerca Sep 10 '21 at 18:17
  • 1) Fewer subscriptions: we started to subscribe only to the really necessary properties and used queries for everything else; this approach also helped us evaluate parts of our app again and remove unnecessary requests/parts of requests. 2) Local caching: we integrated a caching service; this way, data that is unlikely to change during a session (e.g. user names) was saved there and we could remove these properties from our queries/subscriptions. This massively increased our performance – Flowmerca Sep 10 '21 at 18:35
  • Local cache: we never did subscriptions for rarely changing data, so there is no benefit in using it there, and using it for frequently changing data is contradictory. So we have not found a use case for a local cache yet. – Alex Yu Sep 10 '21 at 19:51