Highest Voted 'spark-notebook' Questions

0

votes

1 answer

Azure Spark Notebook Processing Large Text and/or Binary files

See "Azure Synapse Pipeline running Spark Notebook Generates Random Errors" for more background on this. I have been struggling to get an Azure Synapse Spark Notebook to process an uncompressed 778MB IIS file. The previous link shows some of…

c# azure-synapse spark-notebook

asked Mar 15 '22 at 15:51

bmukes

119
2
9

0

votes

3 answers

Azure Synapse Pipeline running Spark Notebook Generates Random Errors

I am processing approximately 19,710 directories containing IIS log files in an Azure Synapse Spark notebook. There are 3 IIS log files in each directory. The notebook reads the 3 files located in the directory and converts them from text…

c# azure-synapse spark-notebook

asked Mar 01 '22 at 22:58

bmukes

119
2
9

0

votes

1 answer

How to flatten a simple JSON file in an Azure Synapse Spark Notebook and convert it to Parquet

I needed to flatten a simple JSON file (JSON Lines) and convert it to Parquet format within a Spark Notebook in Azure Synapse Analytics. There is only one level of nested object for any column. However, I discovered that getting the schema of…

c# json parquet azure-synapse spark-notebook

asked Feb 11 '22 at 18:10

bmukes

119
2
9
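
To illustrate the flattening step described in the question above, here is a minimal plain-Python sketch. It is not the asker's Spark code: the function names, the `_` separator, and the sample records are assumptions made for illustration, and the Parquet conversion is left out. It only shows how a record with one level of nesting can be turned into flat columns.

```python
import json


def flatten_one_level(record, sep="_"):
    """Flatten a dict whose values may be dicts nested one level deep."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Promote each nested field to a top-level column named parent_child.
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def flatten_json_lines(text):
    """Parse JSON Lines text and flatten every non-empty line."""
    return [
        flatten_one_level(json.loads(line))
        for line in text.splitlines()
        if line.strip()
    ]


# Hypothetical sample data, not taken from the question.
sample = '{"id": 1, "user": {"name": "a", "age": 3}}\n{"id": 2, "user": {"name": "b"}}'
rows = flatten_json_lines(sample)
```

Records that lack a nested field simply omit that column, so a downstream writer would need to treat missing keys as nulls.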

0

votes

1 answer

Convert String to Date Time Field in Azure Databricks

I have the following text string that represents a date time from an application: 2021-11-22 07:28:47 PM. I need to convert this to a date time to do a DATE ADD operation. I have tried this many ways with no success and it gives me null in Azure…

azure apache-spark-sql azure-databricks spark-notebook

asked Nov 22 '21 at 21:14

James Khan

773
2
18
46
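
The string in the question above uses a 12-hour clock with an AM/PM marker, so the parse pattern has to say so. The sketch below shows this in plain Python rather than Spark, and the names `RAW`, `FORMAT`, and `parse_12_hour` are assumptions for illustration. In Spark SQL the analogous pattern would presumably be `yyyy-MM-dd hh:mm:ss a` passed to `to_timestamp`, but that has not been checked against the asker's runtime.

```python
from datetime import datetime, timedelta

RAW = "2021-11-22 07:28:47 PM"
# %I is the 12-hour clock hour and %p the AM/PM marker; using %H here
# would not interpret the "PM" suffix correctly.
FORMAT = "%Y-%m-%d %I:%M:%S %p"


def parse_12_hour(text):
    """Parse a 'YYYY-MM-DD hh:mm:ss AM/PM' string into a datetime."""
    return datetime.strptime(text, FORMAT)


parsed = parse_12_hour(RAW)
# Once parsed, date arithmetic (the "DATE ADD" step) is straightforward.
next_day = parsed + timedelta(days=1)
```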

0

votes

0 answers

Orchestrate Azure Synapse Spark notebook from C#/API

Is there a way to execute a notebook from C#, for example through an API or SDK? I found the following to create and update notebooks https://learn.microsoft.com/en-us/dotnet/api/overview/azure/analytics.synapse.artifacts-readme-pre, nothing to trigger it like how I…

apache-spark pyspark azure-synapse azure-sdk-.net spark-notebook

asked Nov 08 '21 at 21:49

user2934433

343
1
5
20

0

votes

1 answer

spark.sql write to CSV causes shifted column data when a value contains a comma

I'm using Scala as the programming language in my Azure Databricks notebook, where my dataframe gives me accurate results, but when I try to store the same data as CSV, the cells shift wherever a comma (,) appears: spark.sql(""" SELECT * FROM…

scala csv azure-databricks spark-notebook

asked Oct 25 '21 at 09:51

Manish Jain

217
1
4
16
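
The column shift described in the question above is what happens when a field containing the delimiter is written or read without quoting. The following plain-Python sketch (not the asker's Scala code; the sample rows and the helper `to_csv` are assumptions) shows the effect and how quoting avoids it. Spark's CSV writer has quoting options of its own, which are not shown here.

```python
import csv
import io

# Hypothetical sample rows: the second field of the data row contains a comma.
rows = [["id", "name"], [1, "Smith, John"]]


def to_csv(data, quoting):
    """Serialize rows to CSV text using the given csv quoting mode."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=quoting, lineterminator="\n").writerows(data)
    return buffer.getvalue()


# Without quoting, a naive split sees three fields instead of two.
naive_fields = "1,Smith, John".split(",")

# With minimal quoting, only fields containing the delimiter are quoted,
# and a CSV-aware reader recovers the original two columns.
quoted = to_csv(rows, csv.QUOTE_MINIMAL)
parsed = list(csv.reader(io.StringIO(quoted)))
```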

0

votes

1 answer

display(df.limit(10)) does not always work in synapse notebooks

Within Synapse notebooks, running display(df.limit(10)) does not always work. It usually works when the notebook is first run, but after a while, if I run it again, it does not display the df. The server has not died or timed out, code is still…

azure-synapse spark-notebook

asked Aug 26 '21 at 12:41

wilson_smyth

1,202
1
14
39

0

votes

1 answer

Azure Synapse Pipeline Notebook Return Error

I want to create a pipeline on Azure Synapse, and one of the steps uses a notebook to read, validate, and then either continue or stop the pipeline: if(validation=True): #success on validation return df #continue the…

python azure apache-spark azure-synapse spark-notebook

asked Jul 04 '21 at 09:17

OctavianWR

217
1
16

0

votes

0 answers

How do I create a Sequence in PySpark that resets when rows change from 0 to 1 and increments when all are 1s

I have a pyspark dataframe like this and need the SEQ output as shown: R_ID ORDER SC_ITEM seq A 1 0 A 3 1 1 A 4 1 2 A 5 1 3 A 6 1 4 A 7 1 5 A 8 1 6 A 9 1 7 A 10 0 0 A 11 1 1 A 12 0…

python pyspark spark-notebook

asked May 10 '21 at 14:09

Shay Pal

1
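
The sequence logic in the question above can be sketched in plain Python as a running counter that resets to 0 on every 0 flag and increments on every 1. This is only an illustration of the rule, not a PySpark solution (which would likely rely on window functions). The function name `add_seq` is an assumption, as is the value 0 for the first row's seq, which is not visible in the excerpt.

```python
def add_seq(rows):
    """Append a seq value to (r_id, order, sc_item) rows.

    Within each r_id, ordered by order, seq counts consecutive
    sc_item == 1 rows and resets to 0 whenever sc_item is 0.
    """
    result = []
    counters = {}
    for r_id, order, sc_item in sorted(rows, key=lambda r: (r[0], r[1])):
        counters[r_id] = counters.get(r_id, 0) + 1 if sc_item == 1 else 0
        result.append((r_id, order, sc_item, counters[r_id]))
    return result


# Rows taken from the visible part of the question's sample data.
sample = [
    ("A", 1, 0), ("A", 3, 1), ("A", 4, 1), ("A", 5, 1), ("A", 6, 1),
    ("A", 7, 1), ("A", 8, 1), ("A", 9, 1), ("A", 10, 0), ("A", 11, 1),
    ("A", 12, 0),
]
seq_values = [row[3] for row in add_seq(sample)]
```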

0

votes

1 answer

Filter like %[A-Za-z]% in databricks

I am trying to use table.column LIKE '%[A-Za-z]%' in a Databricks notebook, but it returns no value. It worked in SQL Server, but it does not seem to work in Pysql. Does anyone know what the alternative is in Databricks?

apache-spark tsql databricks spark-notebook

asked Feb 16 '21 at 12:32

cornerstone347

27
4
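
Bracketed character classes inside LIKE are a T-SQL extension, while standard LIKE treats only % and _ as wildcards, which would explain the empty result in the question above. The usual alternative is a regular-expression match (Spark SQL offers RLIKE for this). The plain-Python sketch below shows the equivalent "contains at least one letter" test; the names and sample values are assumptions for illustration.

```python
import re

# Matches any value containing at least one ASCII letter, which is what
# LIKE '%[A-Za-z]%' expresses in SQL Server.
HAS_LETTER = re.compile(r"[A-Za-z]")


def contains_letter(value):
    """Return True if the string contains at least one ASCII letter."""
    return HAS_LETTER.search(value) is not None


# Hypothetical sample values.
values = ["12345", "12a45", "", "ABC"]
matches = [v for v in values if contains_letter(v)]
```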

0

votes

1 answer

Install interpreter for Zeppelin

I need to install a custom set of interpreters for Apache Zeppelin. I don't need all of the interpreters; I only need md, shell, python (default), jdbc, and spark (default). I tried several approaches, but they failed: Install online via command ./bin/install-interpreter.sh --name…

apache-zeppelin spark-notebook

asked Nov 24 '20 at 14:13

qxk71551

95
9

0

votes

2 answers

Writing parquet file throws...An HTTP header that's mandatory for this request is not specified

I have two ADLSv2 storage accounts, both with hierarchical namespace enabled. In my Python Notebook, I'm reading a CSV file from one storage account and writing it as a parquet file to the other storage account, after some enrichment. I am getting the error below when…

parquet azure-databricks azure-blob-storage spark-notebook

asked Oct 12 '20 at 23:33

user3023949

121
2
8

0

votes

2 answers

Azure Databricks job - notebook snapshot

We run scheduled Databricks jobs daily in Azure Databricks, and they have completed successfully every day. But today (29th Sept 2020), the job is failing within a few seconds with Internal Error. The error message is given below: Error…

azure databricks azure-databricks spark-notebook

asked Sep 29 '20 at 10:07

Saravanan

49
6

0

votes

1 answer

How to call remote SQL function inside PySpark or Scala Databricks notebook

I am writing a Databricks Scala / Python notebook that connects to a SQL Server database, and I want to execute a SQL Server function from the notebook with custom parameters. import com.microsoft.azure.sqldb.spark.config.Config import…

scala function pyspark azure-databricks spark-notebook

asked Jun 18 '20 at 08:39

rohit patil

159
2
9

0

votes

1 answer

Why do some notes in Spark run very slowly, and why do multiple executions in the same situation have different execution times?

My question is about the execution time of PySpark code in Zeppelin. I have some notes in which I work with SQL queries. In one of my notes, I convert my dataframe to pandas with the .toPandas() function. The size of my data is about 600 megabytes. my…

pandas apache-spark apache-zeppelin spark-notebook

asked May 26 '20 at 11:05

Saeed

159
3
13

Questions tagged [spark-notebook]