How to work around the issue of deleting data in an eventstore?

I need to permanently and completely delete some data in order to comply with privacy laws.

I have found these alternatives:

  1. Encrypt the data that you need deleted, and store the encryption key in its own table. When the data needs to be deleted, you then only delete the encryption key.

  2. Use event sourcing for the data that does not need deletion, with a reference to a CRUD database for the confidential data that needs to be deleted.

Are there any other ways of doing it?

Ruben Bartelink
Jan-Terje Sørensen
  • Facing the same problem, we decided to modify the original events that contain the data to be deleted and replace all occurrences of that data with placeholder values. However, your option 2 is more elegant and less error-prone, although you then cannot retain the change history (which might be OK in this case, since it's personal data). – Alexander Langer Jul 31 '14 at 07:59
  • Option 1 is the most appropriate, because you achieve two things: 1. you secure the data, and 2. when the time comes to "forget" it, you simply delete the private key. The only other alternative I can add to your list is to keep the sensitive data in its own stream; then you simply delete the stream. – Sarmaad Jul 31 '14 at 12:10
  • The ddd-cqrs mailing list covered this in the last 2 months (and prob every 3 months before that :) – Ruben Bartelink Aug 14 '14 at 02:06
  • @RubenBartelink do you know what the conclusion was? Is there a summary of the mailing list discussion? – Jan-Terje Sørensen Aug 14 '14 at 05:21
  • @arcone groups.google.com has it. It was a long thread and lots of choices and great insights, together with nice examples. No need for me to waste my time doing a botched synopsis. Did you search for the mailing list? The list is required reading for ES systems so go! – Ruben Bartelink Aug 14 '14 at 15:15
  • Hi Ruben, can you send a copy of the article from the mailing list? Thanks, –  Jan 07 '18 at 18:49

3 Answers

I did that a month ago and tried to make it as simple as possible. I replayed the entire event store, modified the event data, and stored the events in a new event store. In other words, a migration. When everything finished OK, I backed up and deleted the old store. After that, I replayed the new event store against the projections because of the changes.

If you do not have encryption implemented yet, you have to add it somehow, for example by replaying the entire event store.

PS: I just want to mention for other readers that the reasons to change the event store are really limited. Do not do it except to comply with privacy laws or to fix a really nasty bug. If you need to delete a user's data, you could do one of two things:

  • Encrypt all of the user's data, and when you have to delete it, just get rid of the encryption key.
  • Place all of the user's data in a separate store/database, so that when needed you can delete it without affecting other parts of the system.
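
A minimal sketch of the first option (often called crypto-shredding) might look like the following. Everything here is an assumption for illustration: `KeyStore`, `encrypt_event`, and `read_personal` are hypothetical names, and the cipher is a standard-library placeholder that should be replaced by a vetted authenticated cipher (for example AES-GCM) in real code.

```python
import hashlib
import hmac
import json
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Placeholder cipher (HMAC-SHA256 keystream XORed with the data).

    NOT for production: it is unauthenticated and only keeps this sketch
    free of third-party dependencies. Use a vetted cipher in real code.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hmac.new(key, nonce + offset.to_bytes(8, "big"), hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class KeyStore:
    """Hypothetical per-user key table, kept outside the event store."""

    def __init__(self):
        self._keys = {}

    def key_for(self, user_id: str) -> bytes:
        # Create the key on first use, reuse it afterwards.
        return self._keys.setdefault(user_id, secrets.token_bytes(32))

    def get(self, user_id: str):
        return self._keys.get(user_id)

    def forget(self, user_id: str) -> None:
        # Deleting the key is the "delete": the events themselves stay untouched.
        self._keys.pop(user_id, None)


def encrypt_event(keys: KeyStore, user_id: str, personal: dict) -> dict:
    """Build an event whose personal payload is stored only as ciphertext."""
    nonce = secrets.token_bytes(16)
    plaintext = json.dumps(personal).encode("utf-8")
    ciphertext = _keystream_xor(keys.key_for(user_id), nonce, plaintext)
    return {"user_id": user_id, "nonce": nonce.hex(), "ciphertext": ciphertext.hex()}


def read_personal(keys: KeyStore, event: dict):
    """Return the decrypted payload, or None once the key has been deleted."""
    key = keys.get(event["user_id"])
    if key is None:
        return None
    data = _keystream_xor(key, bytes.fromhex(event["nonce"]), bytes.fromhex(event["ciphertext"]))
    return json.loads(data.decode("utf-8"))
```

Projections and event handlers then need to tolerate `read_personal` returning `None`, since that is what every old event for a forgotten user will yield. Backups of the key table also have to be purged, otherwise the data remains recoverable.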
mynkow
First, change your event handlers to not require the data so that things don't break when you remove it.

Then create a small app to read all your events, and write new events to a new event store without the data you needed deleted.

Test that your system still functions using the new event store: it must be able to rehydrate all aggregates and generate all projections/views/readmodels/whateveryoucallthem.

Delete the old event store.
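
The steps above can be sketched roughly as follows, assuming events are plain dictionaries held in an ordered list. The event shape, the `SENSITIVE_FIELDS` names, and the `project_order_counts` projection are all hypothetical stand-ins for whatever your system actually uses.

```python
import copy

# Hypothetical field names; adjust to whatever your events actually carry.
SENSITIVE_FIELDS = {"email", "name"}


def scrub(event: dict, user_id: str) -> dict:
    """Return a copy of the event with the given user's sensitive fields removed."""
    if event.get("user_id") != user_id:
        return event
    cleaned = copy.deepcopy(event)
    for field in SENSITIVE_FIELDS:
        cleaned.get("data", {}).pop(field, None)
    return cleaned


def migrate(old_store: list, user_id: str) -> list:
    """Replay the old store in order and collect scrubbed events for a new store."""
    return [scrub(event, user_id) for event in old_store]


def project_order_counts(store: list) -> dict:
    """Example projection that must keep working without the removed fields."""
    counts = {}
    for event in store:
        if event["type"] == "OrderPlaced":
            counts[event["user_id"]] = counts.get(event["user_id"], 0) + 1
    return counts
```

Running the same projections against the old and the new store and comparing the results is one way to do the "test that your system still functions" step before the old store is deleted.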

CaffGeek
EventStoreDB from Event Store allows you to scavenge events with an expired TTL. Usually, these are temporary events like stats, or anything else that must be removed after a period of time.

In order not to break your model, one would typically use snapshotting to fix the entity state at some point in time, after which the previous events can be deleted without breaking the system.
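
As an illustration of the snapshotting idea only, here is a store-agnostic sketch. It does not use the EventStoreDB API; the account events, the `version` field, and the function names are assumptions made for the example.

```python
def apply(state: dict, event: dict) -> dict:
    """Fold one event into the aggregate state (hypothetical account events)."""
    new_state = dict(state)
    if event["type"] == "Deposited":
        new_state["balance"] = new_state.get("balance", 0) + event["amount"]
    elif event["type"] == "Withdrawn":
        new_state["balance"] = new_state.get("balance", 0) - event["amount"]
    return new_state


def rehydrate(events: list, snapshot: dict = None) -> dict:
    """Rebuild state from an optional snapshot plus the events recorded after it."""
    state = dict(snapshot["state"]) if snapshot else {}
    start = snapshot["version"] if snapshot else 0
    for event in events:
        if event["version"] > start:
            state = apply(state, event)
    return state


def snapshot_and_truncate(events: list, up_to_version: int):
    """Fix the state at up_to_version and drop every event at or before it."""
    covered = [e for e in events if e["version"] <= up_to_version]
    snapshot = {"version": up_to_version, "state": rehydrate(covered)}
    remaining = [e for e in events if e["version"] > up_to_version]
    return snapshot, remaining
```

Note that the snapshot itself must not carry the data that has to be deleted, otherwise truncating the older events achieves nothing from a privacy point of view.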

Alexey Zimarev