Questions tagged [data-lineage]

62 questions
1
vote
0 answers

BigQuery Data Lineage using AuditLogs, PubSub, Dataflow, ZetaSQL and Data Catalog

I have built a BigQuery data lineage system using the document provided by Google (https://cloud.google.com/architecture/building-a-bigquery-data-lineage-solution). I was able to generate the table lineage for all the SQL statements running on BigQuery in my…
1
vote
0 answers

ZetaSQL - Creating a simple catalog with tables and columns using local service

We are using a Python client binding for ZetaSQL GRPC local service in our application to analyze statements and extract referenced tables and output columns. It is possible to extract referenced tables using the following simplified Python code and…
1
vote
2 answers

How to find column-level lineage information within Snowflake

I am trying to find column-level lineage information within Snowflake. A few blogs say that we can build lineage from the data present in the Access_History view, which is under the Account Usage schema, but I could not find the relevant info from…
1
vote
0 answers

Looking for a Data Catalog and Data Lineage Tool That Can Integrate With My Snowflake and Informatica Environment

In my data platform I leverage Informatica Intelligent Cloud Services for orchestrating and processing data into my data warehouse. My data warehouse is in Snowflake. I am looking to start data governance which includes incorporating a data catalog…
1
vote
1 answer

How to check data lineage on Azure Databricks and HDInsight?

I have notebooks that perform transformations on tables stored in DBFS (Databricks File System). I want to capture and display the data lineage. Additionally, I want to know how to do the same in HDInsight.
Ayushi
1
vote
1 answer

How can we preserve provenance and lineage in MarkLogic?

How can we preserve provenance and lineage in MarkLogic? What is the use case for the envelope pattern? Is there any approach to track data lineage while exporting data from data sources?
1
vote
0 answers

How to display HBase data-lineage in Apache Atlas?

I am testing the Apache Atlas data governance tool to display the data lineage of a NoSQL database. I understand that HBase is the only supported NoSQL database as of now (input metadata source). I've set up Apache Atlas 2.0 in an environment having…
Lorem
1
vote
0 answers

Finding lineage of notebooks in Azure Databricks

I am working on a project where we will be creating many notebooks in Azure Databricks. In many cases, notebook calls may be nested. We are looking for an approach to create automated lineage across notebooks. Any help or…
Vijay KVS
1
vote
0 answers

Data lineage on Google Cloud Platform

Data lineage is an important factor in data analytics. I could not find any managed or serverless offering in GCP. Is there an offering on the roadmap, or is it left to the implementer? Please enlighten me.
1
vote
1 answer

I ran the script import-hive.sh and I can search for Hive entities such as tables, databases, views, and columns, but there is no lineage. Is that normal?

Before installing Atlas, there were two Hive tables named atlas_testm and atlas_testm_ext (a view based on atlas_testm) in my Hive database cluster. After installing Atlas and starting the Atlas services, I ran the script named import-hive.sh. I could see these…
1
vote
1 answer

SQL Server SSIS Data Lineage

I currently have some standard SSIS packages in SQL Server that load and transform data from CSV files into a SQL Server database. I would like to capture data lineage for these SSIS packages but am unsure how this can be done. Ideally I don't want…
Justin
1
vote
0 answers

SQL Server 2014: column dependencies / lineage

I would like to know which columns of a table or view are part of a column in my current view. For a "basic" version, I used columns for current view, I used sys.views and sys.dm_sql_referenced_entities ... and some other system catalog…
Falko
1
vote
2 answers

Is there a way to track end-to-end data lineage through Neo4j Cypher query?

I'm using Spring Data along with Spring Boot to populate my Neo4j graph DB. I have the following Neo4j entities defined: Source entity --> @NodeEntity public class Source implements Comparable { @GraphId private Long id; private…
lbvirgo
1
vote
1 answer

How to implement Data Lineage on Hadoop?

We are implementing a few business flows in the financial area. The requirement (unfortunately, not very specific) from the regulator is to have data lineage for auditing purposes. The flow contains 2 parts: synchronous and asynchronous. The synchronous…
aviad
0
votes
0 answers

How to convert an arbitrary SQL statement to column level lineage information via an open source solution?

I have SQL statements (various dialects). I want to get column level lineage information for each statement. Example: The statement SELECT A.c1 as c, SUM(B.c2) as c2_sum FROM A JOIN B ON A.c1 = B.c1 leads to something like { "c":…
rwitzel