Questions tagged [spline-data-lineage-tracker]
10 questions
3
votes
2 answers
Error enabling lineage in spark using spline?
I tried using Spline to track lineage in Spark using both of the ways specified here,
but both of them failed with the same error:
ERROR QueryExecutionEventHandlerFactory: Spline Initialization Failed! Spark lineage tracking is disabled
Spark Agent was not…

Shubham Jain
3
votes
2 answers
Errors while installing Spline (Data Lineage Tool for Spark)
I am trying to install Apache Spline on Windows.
My Spark version is 2.4.0
Scala version is 2.12.0
I am following the steps mentioned here https://absaoss.github.io/spline/
I ran the docker-compose command and the UI is up
wget…

Ayan Biswas
1
vote
1 answer
Can't view lineage using UI for Spline
I have tried everything; the code even writes the data,
but Spline is unable to pick it up.
My code runs successfully, but there is no data in the Spline UI.
Spark - 3.3.1
Scala - 2.12.18
Python - 3.9.6
Spline agent - 1.1.0
Can someone guide me in…

user22032314
1
vote
1 answer
Unable to view the pyspark job in Spline using ec2 instance
We created a sample PySpark job and ran the following spark-submit command on an EC2 instance:
sudo ./bin/spark-submit --packages za.co.absa.spline.agent.spark:spark-3.1-spline-agent-bundle_2.12:0.6.1 --conf…

dhanju
1
vote
1 answer
Finding spark pipeline start time from spline lineage
I'm exploring Spline to determine how long Spark took to execute a pipeline (from initialising the Spark context until writing the result). I can see
"timestamp":1611397050192
in the Spline lineage file, which is actually the write time. Is…

syv
0
votes
1 answer
Unable to use Spline Lineage with AWS Glue 4.0 | Failure
I'm trying to capture the lineage of a PySpark job using Spline in AWS Glue. The job does transformations using DataFrame APIs and then writes the output to S3 as Delta tables.
For now, I want the lineage on the console, but in the end state I want to capture…

Mohd Jaleel
0
votes
1 answer
Spline, pyspark: How to get spline console output in my python code?
In my PySpark code I'm reading a test CSV file, filtering it, and writing it. I can see all of those actions in the console with LoggingLineageDispatcher in JSON format, but I want to find a way to get this data directly in my Python code. Can't find any options for…

Andrej Vilenskij
0
votes
1 answer
Azure Databricks: trying to run Spline for capturing Spark lineage?
I am trying to set up Spline in Azure Databricks but am facing this issue. Any help with this?
:6: error: identifier expected but double literal found.
--packages za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:0.6.1…

ShadowWarrior
0
votes
1 answer
Need to Re-Write Scala Code for Specific JSON Output
I am trying to register Databricks notebook lineage to Azure Purview through Spline and the Apache Atlas API. There are two versions of the code: 1) is the original code, which uses Databricks runtime version 6.4 and is working as expected, but we need to…

datadude123
0
votes
1 answer
Spline Spark agent jar has errors during post processing
I have been trying to run the following code with the new Spline jar: za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:0.6.0 but have been getting errors specific to UserExtraMetadataProvider, which has been deprecated in the newer…

datadude123