
I'm new to Delta Lake and want to create indexes for fast retrieval on some of my Delta Lake tables. According to the docs, the closest feature is data skipping, which is enabled by creating a data skipping index on the table:

CREATE DATASKIPPING INDEX ON [TABLE] [db_name.]table_name

I can't find any way of creating indexes other than data skipping.

How do I create indexes in Delta Lake, the way I would on a table in an RDBMS?

Thanks!

Gaurang Shah
user12264392

1 Answer


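Indexing happens automatically on Databricks Delta and, as of v1.2.0, on OSS Delta Lake. There is no index to create by hand: as you write data, statistics (such as minimum and maximum values) for the columns in each file are collected and added to the internal table metadata. When you query the data with a filter, those statistics are used to skip files that cannot contain matching rows.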
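To make the idea concrete, here is a minimal pure-Python sketch of min/max data skipping. It does not use Delta Lake's API; `FileStats`, `files_to_scan`, and the file names are hypothetical and only model the behaviour described above, assuming one numeric column and a range filter.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Per-file min/max for one column (a stand-in for table metadata)."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(stats, lo, hi):
    """Return the files whose [min, max] range overlaps lo <= col <= hi."""
    return [s.path for s in stats if s.max_value >= lo and s.min_value <= hi]


# Hypothetical files and value ranges, for illustration only.
table_stats = [
    FileStats("part-0.parquet", 1, 100),
    FileStats("part-1.parquet", 101, 200),
    FileStats("part-2.parquet", 201, 300),
]

# A filter like "WHERE col BETWEEN 150 AND 160" only needs one file.
print(files_to_scan(table_stats, 150, 160))
```

Files whose range cannot overlap the filter are never opened, which is why no explicit index is needed for this kind of pruning.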

In addition, you can use Z-ordering on Databricks Delta to optimize the file layout around specific columns. Data skipping is still applied to the other columns as well.
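On Databricks this is done with a command along the lines of `OPTIMIZE my_table ZORDER BY (col_a, col_b)` (table and column names here are placeholders). The toy Python sketch below illustrates why such a layout helps: rows are sorted by a bit-interleaved (Morton) key so that each file ends up covering a narrow range of both columns. This is an assumption-laden illustration of the general Z-order idea, not Delta's actual implementation; `interleave_bits` and `column_ranges` are hypothetical helpers.

```python
def interleave_bits(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order (Morton) value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return z


def column_ranges(file_rows):
    """Return ((min_x, max_x), (min_y, max_y)) for the rows in one file."""
    xs = [r[0] for r in file_rows]
    ys = [r[1] for r in file_rows]
    return (min(xs), max(xs)), (min(ys), max(ys))


# A 4x4 grid of (x, y) rows, sorted by Z-order key and split into 4 "files".
rows = [(x, y) for x in range(4) for y in range(4)]
ordered = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
files = [ordered[i:i + 4] for i in range(0, len(ordered), 4)]

for f in files:
    print(column_ranges(f))
```

Each file covers a small range in both `x` and `y`, so a filter on either column can skip files using the min/max statistics. Sorting on `x` alone would leave every file spanning the full range of `y`.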

Silvio