-3

I have a flatten table. I want to convert it to a nested json format.

database     schema table_name          col_name

exploration  dbo    TestTable           name
exploration  dbo    TestTable           last_name
exploration  dbo    Table_1             name
exploration  dbo    Table_1             d_id
exploration  dbo    tblMonitorChange    EventType
exploration  dbo    tblMonitorChange    SchemaName
exploration  dbo    tblMonitorChange    ObjectName
exploration  dbo    tblMonitorChange    ObjectType
exploration  dbo    tblMonitorChange    EventDate
exploration  dbo    tblMonitorChange    SystemUser
exploration  dbo    tblMonitorChange    CurrentUser
exploration  dbo    tblMonitorChange    OriginalUser
ReportServer dbo    Users               UserID
ReportServer dbo    Users               Sid
ReportServer dbo    Users               UserType
ReportServer dbo    Users               AuthType
ReportServer dbo    Users               UserName
ReportServer dbo    Users               ServiceToken
ReportServer dbo    Users               Setting

i am looking for a general solution, not hard coding column names.

any help would be appreciated.

1 Answers1

0

i solved the problem recursively. this functions tested and works:

def table_to_json(model):
    print('-------------------------------------------------------------')
    doc = {};
    col_names = list(model.columns)
    grouped = model.groupby(col_names[0])[col_names[1]]
    values = grouped.apply(lambda x: set(x.tolist()))
    a = values.shape
    if(len(col_names)==2):
        return dict(values)
    keys = list(grouped.groups.keys())
    for k in keys:
        doc.update({k:table_to_json(model[model[col_names[0]] == k][col_names[1:]])})

    return doc

it takes a dataframe ( check it in my first post ) as input, and returns a dictionary document as output. this is a part of the output:

{'exploration': {'dbo': {'Table_1': {'d_id', 'name'},
   'TestTable': {'last_name', 'name'},
   'tblMonitorChange': {'CurrentUser',
    'EventDate',
    'EventType',
    'ObjectName',
    'ObjectType',
    'OriginalUser',
    'SchemaName',
    'SystemUser'}}},
'ReportServer': {'dbo': {'ActiveSubscriptions': {'ActiveID',
    'SubscriptionID',
    'TotalFailures',
    'TotalNotifications',
    'TotalSuccesses'},
    'Batch': {'Action',
    'AddedOn',
    'BatchID',
    'BoolParam',
    'Content',
    'Item',
    'Param',
    'Parent',
    'Properties'}}}}