I am trying to create a data quality validation for a set of files in S3. For that I have chosen AWS Glue DataBrew and have created a dataset, a data quality ruleset, and a profile job via a SAM template. Once the dataset is created, I have to reference the ARN of the dataset when creating the ruleset, and the ARN of the ruleset when creating the profile job. On checking the documentation, I can see that the ARN is not among the return values of either the dataset or the ruleset resource. So is it possible to reference these values dynamically, or should I create the rulesets separately?

SampleDataSet:
 Type: AWS::DataBrew::Dataset
 Properties:
  Name: SampleDataSet
  Input:
    S3InputDefinition:
      Bucket: *****
      Key: *****

SampleRuleSet:
 Type: AWS::DataBrew::Ruleset
 Properties:
  Name: SampleRuleSet
  Rules:
    - Name: rule1
      Disabled: true
      CheckExpression: "AGG(DUPLICATE_ROWS_COUNT) <= :val1"
      SubstitutionMap:
        - Value: "0"
          ValueReference: ":val1"
  # TargetArn belongs to the ruleset's Properties, not to an individual rule
  TargetArn: !GetAtt SampleDataSet.Arn
 DependsOn: SampleDataSet

SampleProfileJob:
 Type: AWS::DataBrew::Job
 Properties:
  Name: SampleProfileJob
  Type: PROFILE
  RoleArn: !GetAtt GenericDataBrewDataQualityRole.Arn
  DatasetName: SampleDataSet
  Timeout: 5
  ValidationConfigurations:
    - RulesetArn: !GetAtt SampleRuleSet.Arn
  OutputLocation:
    Bucket: *****
 DependsOn: SampleRuleSet
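
A possible workaround, sketched under assumptions rather than verified against a deployed stack: build the ARNs with `!Sub` instead of `!GetAtt`. This assumes that `Ref` on a DataBrew dataset or ruleset returns the resource name, and that DataBrew ARNs follow the pattern `arn:<partition>:databrew:<region>:<account-id>:<resource-type>/<name>`. The resource names and role are the ones from the template above; properties not related to the ARNs are left out.

```yaml
SampleRuleSet:
  Type: AWS::DataBrew::Ruleset
  Properties:
    Name: SampleRuleSet
    # Assumption: ${SampleDataSet} (an implicit Ref) resolves to the dataset name
    TargetArn: !Sub "arn:${AWS::Partition}:databrew:${AWS::Region}:${AWS::AccountId}:dataset/${SampleDataSet}"
    Rules:
      - Name: rule1
        Disabled: true
        CheckExpression: "AGG(DUPLICATE_ROWS_COUNT) <= :val1"
        SubstitutionMap:
          - Value: "0"
            ValueReference: ":val1"

SampleProfileJob:
  Type: AWS::DataBrew::Job
  Properties:
    Name: SampleProfileJob
    Type: PROFILE
    RoleArn: !GetAtt GenericDataBrewDataQualityRole.Arn
    DatasetName: !Ref SampleDataSet
    ValidationConfigurations:
      # Assumption: ${SampleRuleSet} (an implicit Ref) resolves to the ruleset name
      - RulesetArn: !Sub "arn:${AWS::Partition}:databrew:${AWS::Region}:${AWS::AccountId}:ruleset/${SampleRuleSet}"
    # Timeout and OutputLocation stay as in the original template
```

Because `${SampleDataSet}` and `${SampleRuleSet}` inside `!Sub` are references to other resources, CloudFormation infers the creation order from them, so the explicit `DependsOn` entries should not be needed in this variant.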
