
This question is about the AWS Glue Data Catalog.

I want to build a process like this:

Connect GitHub to the AWS Glue Data Catalog -> open a pull request against the Data Catalog definitions (source) -> merge -> apply the modified definitions to the AWS Glue Data Catalog -> export the changed Data Catalog information as Markdown, or update it in Confluence.

The purpose of this work is to make the Data Catalog readable by non-developers.

Is this possible? What documentation should I read? Any advice is welcome.

J184937
  • You need to explain this more clearly. Reading some of the Glue and Git documentation may help you use the right technical terms; the description is confusing. Glue is an ETL service, GitHub is a repository host, and the Data Catalog is not source code: it contains metadata that is stored and managed by AWS. At most, you can create, update, or delete databases and tables in the Data Catalog, but you can't modify the Data Catalog itself. – Sandeep Fatangare Oct 15 '19 at 16:48
  • @SandeepFatangare My question was not detailed enough, sorry. To add to it: I need a way to show non-developers the contents of a crawler-generated AWS Glue Data Catalog, specifically the names, descriptions, and attributes of the columns defined in it. I can't create an account for each of them every time, so I need to automate this so they can view it from outside AWS. I heard that the Data Catalog is itself a table; is it possible to access it externally? – J184937 Oct 16 '19 at 02:39

1 Answer


Option 1: Use the boto3 Glue API to retrieve information about tables, with get_table() or get_tables().

See https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client.get_tables which also includes usage and response examples.

Once you have the response, you can display it on a web page.

Advantage: Non-technical users can access it without any setup.

Disadvantage: A developer has to write the code.

Option 2: Use the AWS CLI. See https://docs.aws.amazon.com/cli/latest/reference/glue/get-table.html

Advantage: No code needed from the developer.

Disadvantage: The user has to know how to set up the AWS CLI, run its commands, and read their output.

Sandeep Fatangare