9

The databricks documentation describes how to do a merge for delta-tables.

In SQL the syntax

MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
ON <merge_condition>
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ]  THEN <not_matched_action> ]

can be used. Is a python-equivalent available?

Alex Ott
  • 80,552
  • 8
  • 87
  • 132
Erik
  • 755
  • 1
  • 5
  • 17
  • 1
    Delta lake is written in [Scala](https://github.com/delta-io) and the API itself support only Scala at the moment – abiratsis Apr 02 '20 at 19:25
  • 1
    @AlexandrosBiratsis: Thanks for the link. It turns out there is a [documented](https://docs.delta.io/latest/api/python/index.html) python api-available. – Erik Apr 05 '20 at 09:51

1 Answers1

14

I managed to find the documentation using the help of Alexandros Biratsis. The documentation can be found here. An example of such a merge is given by

deltaTable.alias("events").merge(
    source = updatesDF.alias("updates"),
    condition = "events.eventId = updates.eventId"
  ).whenMatchedUpdate(set =
    {
      "data": "updates.data",
      "count": "events.count + 1"
    }
  ).whenNotMatchedInsert(values =
    {
      "date": "updates.date",
      "eventId": "updates.eventId",
      "data": "updates.data",
      "count": "1"
    }
  ).execute()
Erik
  • 755
  • 1
  • 5
  • 17
  • 1
    But this allows only one match condition. How would it be with multiple? Or how would I add an additional conditon to each "whenMatchedUpdate"? It should be possible according to the Documentation but there is no code example in the Docu – Alex Ortner Jul 14 '22 at 13:24
  • 2
    If you want to add multiple conditions, you can do it like this: condition = "events.eventId = updates.eventId AND events.date = updates.date" – Evan Grantham-Brown Sep 22 '22 at 02:33