Questions tagged [spark-notebook]

The Spark Notebook is a web application enabling interactive and reproducible data analysis using Apache Spark from the browser.

120 questions
3
votes
2 answers

odd error when populating accumulo 1.6 mutation object via spark-notebook

I am using spark-notebook to update an Accumulo table, employing the method specified in both the Accumulo documentation and the Accumulo example code. Below is verbatim what I put into the notebook, and the responses: val clientRqrdTble = new…
snerd
  • 1,238
  • 1
  • 14
  • 28
2
votes
1 answer

Azure Synapse - How to stop an Apache Spark application / notebook?

When I run (in debug mode) a Spark notebook in Azure Synapse Analytics, it doesn't seem to shut down as expected. In the last cell I call: mssparkutils.notebook.exit("exiting notebook") But then when I fire off another notebook (again in debug mode,…
Dudeman3000
  • 551
  • 8
  • 21
2
votes
0 answers

Making parameters work in Azure .NET C# Spark Notebook

I tried to pass a string parameter into a Spark notebook written entirely in .NET Spark C#. No matter what I tried, it did not work. What finally did work was to: define the notebook as PySharp; define the parameter - PySharp; put the parameter value…
bmukes
  • 119
  • 2
  • 9
2
votes
2 answers

How to return integer value from notebook in adf pipeline

I have a use case where I need to return an integer as output from a Synapse notebook in a pipeline and pass this output to the next stage of my pipeline. Currently mssparkutils.notebook.exit() takes only string values. Are there any utility methods…
boom_clap
  • 129
  • 1
  • 12
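One common workaround for a string-only exit value, sketched here under assumptions, is to serialize the integer to a string in the notebook and convert it back downstream (in a pipeline, for example, with the expression-language int() function). The helper names and the sample value below are hypothetical and are not part of mssparkutils; the actual mssparkutils.notebook.exit() call appears only in a comment because it is available only inside a Synapse session.

```python
def encode_exit_value(value: int) -> str:
    """Serialize an integer so it can be passed to a string-only exit call."""
    return str(value)


def decode_exit_value(raw: str) -> int:
    """Parse the string exit value back into an integer, ignoring whitespace."""
    return int(raw.strip())


# Hypothetical result computed by the notebook.
row_count = 42

exit_value = encode_exit_value(row_count)
# Inside a Synapse notebook, this string would be handed to:
#   mssparkutils.notebook.exit(exit_value)

# Whatever consumes the exit value converts it back to an integer.
restored = decode_exit_value(exit_value)
```

Keeping the conversion explicit on both sides makes it obvious where a non-numeric exit value would fail.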
2
votes
1 answer

Azure databricks CI CD pipeline to delete notebooks on production

I have a CI/CD pipeline in place to deploy notebooks from dev to production in an Azure databricks workspace. However, it is not deleting the notebooks from production, when those notebooks have been removed from development and are no longer in…
2
votes
1 answer

Processing tables in parallel using Azure Data Factory, single pipeline, single Databricks Notebook?

I want to transform a list of tables in parallel using Azure Data Factory and one single Databricks Notebook. I already have an Azure Data Factory (ADF) pipeline that receives a list of tables as a parameter, sets each table from the table list as…
2
votes
0 answers

p.nettyException - Exception caught in Netty java.lang.NoSuchMethodError:

I compiled spark-notebook from sources and am getting an error when trying to run it. Something is wrong with the netty version. There are a lot of components in spark-notebook, and those components require different netty versions. I tried to…
2
votes
2 answers

How to connect Spark-Notebook to Hive metastore?

This is a cluster with Hadoop 2.5.0, Spark 1.2.0, Scala 2.10, provided by CDH 5.3.2. I used a compiled spark-notebook distro. It seems Spark-Notebook cannot find the Hive metastore by default. How to specify the location of hive-site.xml for…
Rex
  • 2,097
  • 5
  • 16
  • 18
1
vote
0 answers

How to make use of custom jar in Synapse Notebook after uploading it in Azure Synapse Workspace?

I am trying to add my customized jar to an Azure Synapse Workspace to make use of a user-defined function (UDF) present in the jar while running a SQL query in a Synapse Notebook. An example: there is a UDF named MapCloud() registered in UDFHelper.scala…
1
vote
0 answers

Create a custom magic in a Glue Studio Notebook

I've been trying to create a custom magic for a Glue Studio Notebook, like the following example (taken from here). I've added the IPython module by running the Glue magic %additional_python_modules IPython and running this from a cell: from…
1
vote
2 answers

How to pass result of Sql query from Azure Synapse notebook to next activity in Synapse Pipeline?

I have a Main pipeline in Synapse workspace which has 2 activities: 1st - Notebook activity 2nd - If Condition activity For the 1st one (Synapse notebook, spark pool, pyspark), I have a SQL cell like the following: It has a simple query using a…
1
vote
1 answer

Azure Synapse Lake Database - Notebook cannot access information_schema

In Synapse Analytics I can write the following SQL script and it works fine: SELECT Table_name FROM dataverse_blob_blob.information_schema.tables WHERE Table_name NOT LIKE '%_partitioned' ORDER BY 1 I am trying to do the same using a…
1
vote
2 answers

Synapse : Execute the Magic Command (%run) Notebook from Pipeline

The magic commands work perfectly in notebooks. However, when running the same notebook from a Synapse pipeline, it could not locate the notebook's path. Appreciate your help { "errorCode": "6002", "message": "MagicUsageError: Cannot read…
1
vote
0 answers

Trigger job from a notebook in Azure databricks

I have a job that accepts a date as a parameter. I need to pass yesterday's date as the parameter to the job and schedule it. But we can't send yesterday's date (a dynamic parameter) when it is scheduled. So I created a notebook (say N2) on top of this and I…
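As a minimal sketch of the dynamic-parameter part of this question, yesterday's date can be computed in the wrapper notebook at run time. The function name and the ISO date format are assumptions for illustration; how the value is then passed to the job is not shown, since the excerpt is truncated.

```python
from datetime import date, timedelta
from typing import Optional


def yesterday_iso(today: Optional[date] = None) -> str:
    """Return the day before `today` as a YYYY-MM-DD string.

    `today` defaults to the current local date; passing it explicitly
    keeps the function easy to test.
    """
    if today is None:
        today = date.today()
    return (today - timedelta(days=1)).isoformat()


# Value that a wrapper notebook could pass on as the job's date parameter.
run_date = yesterday_iso()
```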
1
vote
1 answer

Errorcode:6002 in Azure Synapse Analytics pipeline

We got the following error after running a notebook in a pipeline, in which data is transformed and then saved. If the write to CSV is commented out, the pipeline works. In a normal notebook run the write to CSV also works fine, but only in…