5

What are the options available to schedule or automate refresh of a quicksight SPICE dataset?

Are there any APIs available to automate spice datatset refresh? preferably using python.

MikA
  • 5,184
  • 5
  • 33
  • 42

1 Answers1

14

You have two options,

- Using API services available in the latest version of boto3

Use 'create_ingestion' method to initiate dataset refresh, and use 'describe_ingestion' to check the status of refresh

import boto3
import time
import sys
client = boto3.client('quicksight')
response = client.create_ingestion(DataSetId='<dataset-id>',IngestionId='<ingetion-id>',AwsAccountId='<aws-account-id>')
while True:
    response = client.describe_ingestion(DataSetId='<dataset-id>',IngestionId='<ingetion-id>',AwsAccountId='<aws-account-id>')
        if response['Ingestion']['IngestionStatus'] in ('INITIALIZED', 'QUEUED', 'RUNNING'):
            time.sleep(10) #change sleep time according to your dataset size
        elif response['Ingestion']['IngestionStatus'] == 'COMPLETED':
            print("refresh completed. RowsIngested {0}, RowsDropped {1}, IngestionTimeInSeconds {2}, IngestionSizeInBytes {3}".format(
                response['Ingestion']['RowInfo']['RowsIngested'],
                response['Ingestion']['RowInfo']['RowsDropped'],
                response['Ingestion']['IngestionTimeInSeconds'],
                response['Ingestion']['IngestionSizeInBytes']))
            break
        else:
            print("refresh failed! - status {0}".format(response['Ingestion']['IngestionStatus']))
            sys.exit(1)

DataSetId of dataset can be found from aws URI or use 'list_data_sets' method to list all datasets and get DataSetId from the field ['DataSetSummaries']['DataSetId'] method call response

IngestionId - set unique id, I used current time in epoch [str(int(time.time()))]

- Schedule refresh using schedule option in quicksight dataset

You can schedule refreshes for 'hourly', 'daily', 'weekly' or 'monthly' cadence using schedule option in quicksight-dataset

enter image description here

MikA
  • 5,184
  • 5
  • 33
  • 42
  • 2
    But I guess there's no option to programmatically configure the "Schedule refresh" operation. Neither in the API to create a DataSet nor in the Create Ingestion. – Aashish gaba Mar 24 '22 at 10:27
  • 1
    The above code worked 100% for me just add commend : DataSetId of the dataset can be found from AWS URI or use the following AWS cli command to get all the list of datasets (data source ) : - > aws quicksight list-data-sources --aws-account-id xxxx5699xx Account Id: xxxx5699xx we need to pass this parameter – Gautam Jan 20 '23 at 06:03