Questions tagged [apache-atlas]

Apache Atlas is a data governance and metadata framework for Hadoop. Use for questions about setting up Atlas, the REST APIs, bridges, or problems encountered using Atlas.

Data Governance and Metadata framework for Hadoop

Features

  • Data Classification

Import or define taxonomy business-oriented annotations for data. Define, annotate, and automate capture of relationships between data sets and underlying elements including source, target, and derivation processes. Export metadata to third-party systems.

  • Centralized Auditing

Capture security access information for every application, process, and interaction with data. Capture the operational information for execution, steps, and activities.

  • Search & Lineage (Browse)

Pre-defined navigation paths to explore the data classification and audit information. Text-based search features locate relevant data and audit events across the Data Lake quickly and accurately. Browse visualization of data set lineage, allowing users to drill down into operational, security, and provenance-related information.

  • Security & Policy Engine

Rationalize compliance policy at runtime based on data classification schemes, attributes, and roles. Advanced definition of policies (prohibitions) for preventing data derivation based on classification (i.e. re-identification). Column- and row-level masking based on cell values and attributes.

107 questions
0
votes
2 answers

How to plug in a process of identifying sensitive information somewhere in ETL pipeline?

Hope you are doing well! We have already developed an ETL pipeline using Apache NiFi, which is triggered only when a client uploads a source data file from the portal. After that, the data inside the source file goes through various layers, gets transformed…
Manoj Dhake
  • 227
  • 4
  • 16
0
votes
1 answer

How to define a field in a grok regex in Fluentd

I have the Apache Atlas audit logs below: [INFO] 2020-06-29 15:14:31,732 AUDIT logJSON - {"repoType":15,"repo":"atlas","reqUser":"varun","evtTime":"2020-06-29…
chitender kumar
  • 394
  • 4
  • 21
0
votes
1 answer

How do I save lineage info in Apache Atlas when using Apache Cassandra and Elasticsearch

I am planning to deploy Apache Atlas using Apache Cassandra as a storage backend and Elasticsearch as an index backend. I am wondering how I can save lineage info with this setup. It provides a GET API to retrieve the lineage info but seems to have no way to…
Michael Scott
  • 540
  • 2
  • 8
0
votes
2 answers

How to find credentials of Sandbox HDP-3.0.1 Atlas installed in Docker

https://www.cloudera.com/tutorials/sandbox-deployment-and-install-guide/3.html I am following the above reference and got HDP installed in Docker on Linux. Most of the services are running. I am able to log into Ambari and Ranger as admin and raj_ops respectively…
Irshad Ali
  • 1,153
  • 1
  • 13
  • 39
0
votes
2 answers

In Apache Atlas, is there a way to delete/clean soft deleted entities after enabling hard delete?

We used to have soft delete and recently enabled hard delete in Atlas 1.1. Now we are trying to clean up the soft-deleted entities via the delete-by-GUID API and are not able to clear them. Is there a way to delete/clean soft deleted entities after…
Ravi
  • 1
  • 2
0
votes
0 answers

Session 0x0 for server null when starting Atlas

I just installed Atlas in HDP 2.6.3, and starting the Atlas server gave the error below: /var/log/atlas/application.log 2019-12-17 23:41:30,446 INFO - [main-SendThread(1:2181):] ~ Opening socket connection to server 1/0.0.0.1:2181. Will not attempt…
HP.
  • 19,226
  • 53
  • 154
  • 253
0
votes
2 answers

While trying to run Apache Atlas, I'm facing an HBase error (I'm using embedded HBase and Solr)

My Apache Atlas server has started, but I found errors in my application.log file. The UI for Apache Atlas is also not running. I've followed each and every step from the Apache website, and everything went well. I gave all permissions in atlas-env.sh and…
0
votes
3 answers

Connecting to apache atlas + hbase + solr setup with gremlin cli

I am new to Atlas and JanusGraph. I have a local setup of Atlas with HBase and Solr as the backends, with dummy data. I would like to use the Gremlin CLI + Gremlin Server and connect to the existing data in HBase, i.e. view and traverse the dummy Atlas…
druuu
  • 1,676
  • 6
  • 19
  • 36
0
votes
4 answers

Can't set up spark application with spark-atlas-connector

I can't set up my Spark application with Apache Atlas via spark-atlas-connector. I cloned the https://github.com/hortonworks-spark/spark-atlas-connector project and executed mvn package. Then I put all the jars in my project and set up the config like this: def…
Dave
  • 507
  • 7
  • 22
0
votes
1 answer

What is the proper endpoint for HDP3.1 Atlas REST API?

I am using Atlas v1.1 with HDP 3.1 and appear unable to access the API endpoint for making requests related to relationship characteristics. From the docs (here (for API access) and here (for specific endpoint)), I would think to do something…
lampShadesDrifter
  • 3,925
  • 8
  • 40
  • 102
0
votes
2 answers

Unable to delete Apache Atlas Classification

I have a list of Classifications & Sub-classifications in Apache Atlas. I want to delete them and create a new list. All the other classifications are deleted, but one of them, named "PII", gives the following error when we select Delete…
ankitbaldua
  • 263
  • 4
  • 14
0
votes
0 answers

Error connecting to the backend of apache atlas

I am trying to manage Apache Atlas APIs using WSO2 API Manager. When trying a GET request like this, for example: http://{address_IP:port}/atlas/2.0.0-SNAPSHOT/v2/entity/bulk, Postman gives a 101503 error connecting to the backend. I just figured out…
0
votes
2 answers

How to scale out apache atlas

There is no info provided in the Atlas documentation on how to scale it. Apache Atlas is connected to Cassandra or HBase in the backend, which can scale out, but I don't know how the Apache Atlas engine (REST web service and request processor) can scale out. I…
abhi
  • 141
  • 8
0
votes
1 answer

Lineage is not visible for Hive Managed Table in HDP Atlas

I am using Atlas with HDP for creating the lineage flow for my Hive tables, but the lineage is only visible for the Hive external tables. I have created Hive managed tables, performed a join operation to create a new table, and imported the hive meta…
AbhinavVaidya8
  • 522
  • 6
  • 18
0
votes
1 answer

Using Apache Atlas in CDH when importing Hive metadata

root> sh import-hive.sh Using Hive configuration directory…
supersujj
  • 11
  • 1
  • 4