
The actual problem I'm trying to solve is that I'm using mkdocs with the mkdocs-material theme for my documentation, but that tooling can't work with Databricks notebook files.

So, as a clumsy workaround, what I'm figuring is an intermediate step that creates a copy of each notebook's content as a .py file in the same workspace folder, has mkdocs build off of those copies, and then deletes the copies before pushing.
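The copy-then-delete step could look something like the sketch below. The function names and the idea of staging the copies into the mkdocs source tree are my own assumptions, not anything Databricks- or mkdocs-specific:

```python
import shutil
from pathlib import Path


def copy_sources(src_dir: str, build_dir: str) -> list[Path]:
    # Hypothetical helper: copy exported .py files into the mkdocs
    # source tree so the build can pick them up; return the copies.
    copies = []
    for src in Path(src_dir).glob("*.py"):
        dest = Path(build_dir) / src.name
        shutil.copy(src, dest)
        copies.append(dest)
    return copies


def remove_copies(copies: list[Path]) -> None:
    # Delete the temporary copies before committing/pushing.
    for path in copies:
        path.unlink(missing_ok=True)
```

You'd call `copy_sources` before `mkdocs build` and `remove_copies` afterwards.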

For example, I've got a notebook in my workspace whose display looks like this:

%sql
select * from something

%sql
select * from something_else

def some_dummy_function():
    print('dummy')

When you export a notebook as a source Python file via the GUI, you get this, with all the tagging for cell syntax:

# Databricks notebook source
# MAGIC %sql 
# MAGIC select * from something
# COMMAND ----------

# MAGIC %sql
# MAGIC select * from something_else

def some_dummy_function():
    print('dummy')
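If the goal is readable docs rather than a faithful round-trip, the export tags above could be stripped before handing the file to mkdocs. A minimal sketch, assuming the exported format shown above (the function name is mine):

```python
def strip_databricks_markers(source: str) -> str:
    # Remove Databricks export tags: drop the header and cell
    # separators, and unwrap "# MAGIC" lines into plain comments.
    cleaned = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "# Databricks notebook source":
            continue
        if stripped.startswith("# COMMAND ----------"):
            continue
        if stripped.startswith("# MAGIC"):
            # "# MAGIC %sql" becomes "# %sql"
            cleaned.append(line.replace("# MAGIC", "#", 1).rstrip())
            continue
        cleaned.append(line)
    return "\n".join(cleaned)
```

This keeps the `%sql` cells visible as comments while leaving plain Python untouched.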

I want to get this programmatically, from a notebook in a workspace.

Or if you've got suggestions for the root problem at hand ... all ears.

Error_2646

1 Answer


Cobbled this together using the following answer as a reference, mainly for the base64 decode idea:

String search in all Databricks notebook in workspace level

and this handy package: https://pypi.org/project/databricks-api/

pip install databricks-api

from databricks_api import DatabricksAPI
import base64

# Pull the host and token from the current notebook's context
notebook_context = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
databricks_api_instance = DatabricksAPI(
    host=notebook_context.apiUrl().getOrElse(None),
    token=notebook_context.apiToken().getOrElse(None),
)

# Export the notebook in SOURCE format; the content comes back base64-encoded
response = databricks_api_instance.workspace.export_workspace(
    "/Repos/me@my_company.com/my_repo/my_notebook",
    format="SOURCE",
    direct_download=None,
    headers=None,
)

notebook_content = base64.b64decode(response["content"]).decode("utf-8")

# Write the decoded source next to the notebook in the repo
with open("/Workspace/Repos/me@my_company.com/my_repo/new_file_name.py", "w") as f:
    f.write(notebook_content)
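To cover a whole folder of notebooks rather than one at a time, the same export call could be combined with the workspace listing endpoint. A sketch, assuming the `workspace.list` method exposed by the databricks-api package returns an `objects` array with `object_type` and `path` fields (the `export_folder` name is mine):

```python
import base64


def export_folder(client, folder: str, out_dir: str) -> list[str]:
    # Export every NOTEBOOK directly under `folder` as a SOURCE .py
    # file into out_dir. `client` is a DatabricksAPI instance.
    written = []
    listing = client.workspace.list(folder)
    for obj in listing.get("objects", []):
        if obj.get("object_type") != "NOTEBOOK":
            continue
        resp = client.workspace.export_workspace(obj["path"], format="SOURCE")
        content = base64.b64decode(resp["content"]).decode("utf-8")
        name = obj["path"].rsplit("/", 1)[-1] + ".py"
        target = f"{out_dir}/{name}"
        with open(target, "w") as f:
            f.write(content)
        written.append(target)
    return written
```

Something like `export_folder(databricks_api_instance, "/Repos/me@my_company.com/my_repo", "/tmp/docs")` would then stage all the copies in one pass.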