
I have an existing HDInsight installation. On the same cluster, I've created a few files using PySpark with Python 3 support.

I intend to call this Python notebook via a REST API, and Livy Server seems to be the way forward.

The problem I'm facing is that exposing the Python notebook through Livy Server is not working.

Is there any way to allow Python Notebooks to be called externally via Livy APIs?

Sanket Tarun Shah

2 Answers


If I understand your question correctly, you have:

  • a running Spark cluster in HDInsight
  • a Python notebook (which I'm assuming is Jupyter) that you have either on your local computer or on a virtual machine.

If so, you can set up sparkmagic on your local machine and configure the config file in sparkmagic to connect to your HDInsight Spark cluster. The Azure guide "Install Jupyter Notebook on your computer and connect to Apache Spark on HDInsight" walks through this setup.

sparkmagic is a Livy client that lets you work interactively with remote Spark clusters through Livy.
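If the goal is to trigger the code over REST rather than work with it interactively, you can also talk to the same Livy endpoint directly: save the notebook's code as a .py script in the cluster's default storage and submit it as a Livy batch. Below is a minimal sketch using the requests library; the cluster name, cluster login password, and wasbs:// path are placeholders you would replace with your own values.

import requests

# Livy endpoint of the HDInsight cluster (placeholder cluster name)
LIVY_URL = "https://<cluster-name>.azurehdinsight.net/livy/batches"
# HTTP basic auth with the cluster login credentials (placeholder password)
AUTH = ("admin", "<cluster-login-password>")
HEADERS = {"Content-Type": "application/json", "X-Requested-By": "admin"}

# Submit the PySpark script (stored in the cluster's default storage) as a batch job
payload = {"file": "wasbs:///example/jobs/my_script.py"}
resp = requests.post(LIVY_URL, auth=AUTH, json=payload, headers=HEADERS)
batch = resp.json()
print("Submitted batch", batch["id"], "state:", batch["state"])

# Poll the batch by id until its state reaches "success" or "dead"
state = requests.get(f"{LIVY_URL}/{batch['id']}", auth=AUTH, headers=HEADERS).json()
print(state["state"])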

rs323

Not sure about the notebook part, but the HDInsight SDK for Python provides classes and methods that allow you to manage your HDInsight clusters. It includes operations to create, delete, update, list, and resize clusters, execute script actions, monitor clusters, get their properties, and more.

The pip package for the SDK:

pip install azure-mgmt-hdinsight

The SDK first needs to be authenticated with your Azure subscription.

Login:

from azure.mgmt.hdinsight import HDInsightManagementClient
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.hdinsight.models import *

# Tenant ID for your Azure Subscription
TENANT_ID = ''
# Your Service Principal App Client ID
CLIENT_ID = ''
# Your Service Principal Client Secret
CLIENT_SECRET = ''
# Your Azure Subscription ID
SUBSCRIPTION_ID = ''

credentials = ServicePrincipalCredentials(
    client_id = CLIENT_ID,
    secret = CLIENT_SECRET,
    tenant = TENANT_ID
)

client = HDInsightManagementClient(credentials, SUBSCRIPTION_ID)

HDInsight provides a configuration method called script actions that invokes custom scripts to customize the cluster.

# Valid roles are "headnode", "workernode", "zookeepernode", and "edgenode"
script_action1 = RuntimeScriptAction(name="<Script Name>", uri="<URL To Script>", roles=[<List of Roles>])

# Add more RuntimeScriptActions to the list to execute multiple scripts
client.clusters.execute_script_actions("<Resource Group Name>", "<Cluster Name>", <persist_on_success (bool)>, script_actions=[script_action1])

To list all persisted script actions for the specified cluster:

scripts_paged = client.script_actions.list_persisted_scripts(resource_group_name, cluster_name)
while True:
  try:
    for script in scripts_paged.advance_page():
      print(script)
  except StopIteration:
    break

See if it helps.

Mohit Verma
  • The idea is to expose the Python script I've written over Livy / another mechanism, NOT to manage the HDInsight cluster. The target is to invoke my Python script via some sort of remote mechanism (a REST API via Livy is the closest I can think of) and run the job on the HDInsight cluster. – Sanket Tarun Shah May 24 '19 at 02:57