0

Let us take the following MYSQL query written in BigQuery for creating a view.

SELECT a.col1,
       a.col2,
       a.col3,
       b.col1,
       b.col2,
       b.col3,
       c.col1,
       c.col2
FROM project_name_dataset_table_a a
INNER JOIN project_name_dataset_table_b b ON a.col1 = b.col1
INNER JOIN project_name_dataset_table_c c ON a.col2 = c.col2

Below is the required output format or similar. Final Format Required or any similar will be fine

Basically i have to prepare a document which will contain all the information of the views created and the tables and their respective columns used in it.

Danish Bansal
  • 608
  • 1
  • 7
  • 25
  • I hope what you are looking for is answered here https://stackoverflow.com/questions/69225843/dbc-columns-in-teradata-vs-information-schema-columns-in-bigquery/69234050?noredirect=1#comment122458006_69234050 – Mr.Batra Sep 23 '21 at 07:58
  • No in that case I have to run the query actually. But I wanted a solution which just convert my query string to desired results – Danish Bansal Sep 23 '21 at 08:13

2 Answers2

0

I have created a script for the usage at my end you can tweak the regex according to you

import re

query = """
SELECT a.col1, a.col2, a.col3, b.col1, b.col2, b.col3, c.col1, c.col2 
FROM `project_name.dataset.table_a` a
JOIN `project_name.dataset.table_b` b
ON ...
JOIN `project_name.dataset.table_c` c
ON ...
"""
tabsandcols = re.findall(r'([A-Za-z0-9\-\_]*)\.([a-zA-Z0-9\.\_]*)', query)

alias_names = re.findall(r'\`([A-Za-z\-\_]*)\.([a-zA-Z\.\_]*)\` ([a-z]{1,3})', query)

dic = {}

print(alias_names)

for i, j, k in alias_names:
    if k in dic:
        pass
    else:
        dic[k] = j

l = set()
for i, j in tabsandcols:
    if i != "project_name" and dic.get(i, "") != "":
        l.add(dic.get(i, "") + " " + j)

for i in l:
    print("project_name", i)

It will provide following output

project_name dataset.table_a col1
project_name dataset.table_b col2
project_name dataset.table_c col1
project_name dataset.table_c col2
project_name dataset.table_a col3
project_name dataset.table_a col2
project_name dataset.table_b col1
project_name dataset.table_b col3
Danish Bansal
  • 608
  • 1
  • 7
  • 25
0

I could replicate the code. I suggest that you use this command from BigQuery (INFORMATION_SCHEMA.COLUMNS). This command brings you the information that you need.

This is a sample code:

SELECT A.table_catalog, CONCAT(A.table_schema, '.' , A.table_name) as table,A.column_name
FROM myproject_name.mydataset_name.INFORMATION_SCHEMA.COLUMNS A
where A.table_catalog=myproject_name

If you need more information about this command you can click here:

Raul Saucedo
  • 1,614
  • 1
  • 4
  • 13