The Spark Notebook is a web application enabling interactive and reproducible data analysis using Apache Spark from the browser.
Questions tagged [spark-notebook]
120 questions
1 vote · 1 answer
Spark notebook worksheets not saved with Docker
When I start spark-notebook using Docker and create a new worksheet, the worksheet isn't there the next time I start it.
Here's the command:
docker run -v /Users/pkerp/projects/chairliftplot/:/mnt -p 9000:9000…

juniper- (6,262 rep)
0 votes · 0 answers
Is it possible to use a Databricks notebook output as my data source in Power BI?
I would like to know if it's possible to use a Databricks notebook output as my data source in Power BI; I don't want to export it to CSV first.
I tried creating a connection to Databricks from Power BI and got a list of all the tables available.…

SMGT90 (1 rep)
0 votes · 0 answers
Pie chart cannot be resized in PySpark
I am trying to get a pie chart displayed in a Databricks notebook. The code to display it works; what doesn't work is adjusting the size.
I've tried the plt.figure(figsize=()) method with no change. Tried the ax.pie(...,…
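A common cause of this symptom is that `plt.figure(figsize=...)` is called after the axes already exist, so it silently creates a new, empty figure. A minimal sketch with plain matplotlib (the data and labels here are hypothetical; in Databricks the rendered size also depends on how the figure object is displayed):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; notebooks pick their own
import matplotlib.pyplot as plt

# Create the figure with an explicit size (in inches) *before* plotting;
# calling plt.figure(figsize=...) after the axes exist affects a new,
# empty figure rather than the one holding the pie chart.
fig, ax = plt.subplots(figsize=(8, 8), dpi=100)
ax.pie([35, 25, 40], labels=["a", "b", "c"], autopct="%1.0f%%")
ax.set_title("Share by category")

# In a notebook, render the figure object itself (e.g. display(fig) in
# Databricks) so the sized figure is what actually gets drawn.
```

If the chart still renders small, checking `fig.get_size_inches()` confirms whether the size was applied to the figure that is being displayed.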

nadishancosta (1 rep)
0 votes · 1 answer
Saving custom logging in Databricks without a try/except block
I'm trying to write a logging system in databricks for a few jobs we need to run.
Currently I'm setting up a logger and logging to an in-memory stream -> log_stream = io.StringIO()
All functions are wrapped in a try/except block to catch info or exception…
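The pattern the question describes can be sketched with the standard library alone: a logger writing into an `io.StringIO` buffer, with job steps wrapped in try/except. The step function and messages below are hypothetical stand-ins for the real jobs:

```python
import io
import logging

# In-memory log buffer: everything the logger emits is captured here and
# can be persisted (e.g. to storage) once at the end of the job.
log_stream = io.StringIO()

logger = logging.getLogger("job_logger")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

def run_step():
    # Hypothetical job step wrapped in try/except, as the question describes.
    try:
        logger.info("step started")
        raise ValueError("bad input")      # simulate a failure
    except Exception:
        logger.exception("step failed")    # records the message plus traceback

run_step()
captured = log_stream.getvalue()           # the full in-memory log as a string
```

`logger.exception` inside the except block captures the traceback automatically, which avoids hand-formatting exceptions in every handler.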

Mitchell (1 rep)
0 votes · 1 answer
How can I make my Datalake Gen2 mount points dynamic in my MS Synapse Spark Notebooks?
I am developing Spark notebooks in Microsoft Azure Synapse. I can easily pass in a parameter with the base mount point for accessing files in Data Lake Gen2 storage, but I would just like to use the linked service as defined in the workspace instead…
0 votes · 0 answers
How to check in the calling notebook whether a Databricks job was launched manually or by the scheduler?
How can I check in the calling notebook whether a Databricks job was launched manually or by the scheduler? Any help would be really appreciated.
I need this because I have to pass the parameters or arguments based on the launcher. In my case, if the Databricks job is…
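One approach is to inspect the notebook context that Databricks exposes (commonly read via `dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()`): a scheduled job run carries job-related tags that a manual interactive run does not. Since `dbutils` only exists inside Databricks, the sketch below mocks the context payload, and the tag names are assumptions to be checked against a real run:

```python
import json

def launched_by_scheduler(context_json: str) -> bool:
    """Guess the launcher from a Databricks notebook-context payload.

    Hypothetical sketch: inside Databricks, context_json would come from
    the notebook context's toJson(); a job run is assumed to carry a
    "jobId" tag, while a manual interactive run is not.
    """
    tags = json.loads(context_json).get("tags", {})
    return "jobId" in tags  # assumed present only when launched as a job

# Mocked payloads standing in for the real context (assumption, not the API).
scheduled = json.dumps({"tags": {"jobId": "123", "runId": "456"}})
manual = json.dumps({"tags": {"browserHostName": "adb-1.azuredatabricks.net"}})
```

Printing the real context JSON once from both a manual and a scheduled run is the reliable way to confirm which tags distinguish the two launchers.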

Adarsh (1 rep)
0 votes · 1 answer
Mapping Spark Dataframe Columns to SQL Table Columns in Azure Synapse Notebook
I use Azure Synapse notebooks (PySpark) to process data from blob storage and write it into SQL Server. I'm using the below to write to the SQL table:
df.write \
.format("jdbc") \
.option("url", <...>) \
.option("dbtable",…

Chad Goldsworthy (69 rep)
0 votes · 1 answer
errorCode:6002 in Azure Synapse pipeline
When I try to execute a pipeline I get this error. I am following the steps from this GitHub repository: https://github.com/microsoft/OpenEduAnalytics/tree/main/modules/module_catalog/Microsoft_Education_Insights/pipeline. It works alright when I enter the…

Magoji (1 rep)
0 votes · 1 answer
Databricks incorrect data while writing in Delta location
I am facing the issue below while writing data to a Delta location: I am getting incorrect data. I am using a Python notebook in Azure Databricks.
Dataset Used : /databricks-datasets/flights/
Below are the steps I performed.
Mount to blob…

Baxy (139 rep)
0 votes · 1 answer
Synapse notebook incorrectly detects column type
I'm importing data from a CSV into a Parquet file using a Synapse notebook. The code types the Zip Code field as an INT, losing the leading 0(s) on many Zip Codes. My question is: how can I force a column to be typed as a string? Here is the code…

user2197446 (1,065 rep)
0 votes · 1 answer
Synapse Spark Notebook Schema mismatch error while writing to a dedicated SQL Pool
I am getting the below error while trying to write to a Synapse Dedicated SQL Pool. The dataframe and the target table have the same schema, but it seems the connector cannot find some columns in the dataframe while matching the target…

DevFahim (1 rep)
0 votes · 0 answers
Accessing a FastAPI endpoint using Personal Access Token (PAT)
I have a FastAPI endpoint on a cluster with address 0.0.0.0:8084/predict, and I want to send a request to this endpoint from a React app hosted locally on my computer. I have a Personal Access Token for the workspace but don't know how to…

Aakash Bhandari (44 rep)
0 votes · 1 answer
Azure Synapse .NET C# Sparkpool: Fail to start interpreter
When I am working on a .NET Spark (C#) notebook in Azure Synapse, I always get the following error message: Fail to start interpreter. detail: org.apache.spark.api.dotnet.DotnetBackend. When changing the language from .NET Spark (C#) to Python or…

tomotom12 (46 rep)
0 votes · 0 answers
Can we pass dataframes between different notebooks in Databricks and sequentially run multiple notebooks?
I have to sequentially run a couple of notebooks in Databricks, pass the result as a dataframe from one notebook to the next, and create a table in the last notebook using Databricks and PySpark.
Is this feasible?
Unable to…

Shruti (1 rep)
0 votes · 0 answers
Apache Spark in Azure Synapse Analytics - HTTP request in notebook
I use a Notebook in Synapse where I run my Python code.
I would like to make an API request from this Notebook to Microsoft Purview to send the entities.
I added the pyapacheatlas library to spark.
On my local computer, this code works fine in…

DieX (1 rep)