When I create a datasource with the Python script below, at least one of my attributes shows 100% missing values. When I create the datasource manually through the AWS ML dashboard and apply the same attribute types, no values are missing. Is there a problem with how I'm creating the datasource from S3?

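import boto3

# file_name_train, file_name_testing, bucket_name, file_extension and date
# are assumed to be defined earlier in the script.
file_names = [file_name_train, file_name_testing]
client = boto3.client('machinelearning')

# Read the schema JSON once and reuse it for both datasources.
with open('../Selections/aws_schema.txt', 'r') as schema_file:
    schema = schema_file.read()

for file_name in file_names:
    response = client.create_data_source_from_s3(
        DataSourceId=file_name + date,
        DataSourceName=file_name + date,
        DataSpec={
            'DataLocationS3': 's3://' + bucket_name + '/' + file_name + file_extension,
            'DataSchema': schema,
        },
        ComputeStatistics=True,
    )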
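
To narrow this down locally, a minimal sketch (not part of the script above) could compare the schema text with a sample of the CSV before anything is sent to AWS. It assumes the schema is Amazon ML's JSON format with an `attributes` list and a `dataFileContainsHeader` flag, and that attributes map to CSV columns by position. The function name `count_unparseable_numeric` is hypothetical, and the check only covers NUMERIC attributes.

```python
import csv
import io
import json


def count_unparseable_numeric(schema_text, csv_text):
    """Return {attribute_name: (bad_values, total_rows)} for NUMERIC attributes.

    A value counts as bad when it is empty or float() rejects it.
    Assumption: attribute i in the schema describes column i of the CSV.
    """
    schema = json.loads(schema_text)
    rows = list(csv.reader(io.StringIO(csv_text)))

    # Skip the header row when the schema says the file has one.
    if schema.get("dataFileContainsHeader") and rows:
        rows = rows[1:]

    report = {}
    for index, attribute in enumerate(schema["attributes"]):
        if attribute["attributeType"] != "NUMERIC":
            continue
        bad = 0
        for row in rows:
            value = row[index].strip() if index < len(row) else ""
            try:
                float(value)  # float("") also raises ValueError
            except ValueError:
                bad += 1
        report[attribute["attributeName"]] = (bad, len(rows))
    return report


# Made-up sample data, for illustration only.
sample_schema = json.dumps({
    "version": "1.0",
    "dataFormat": "CSV",
    "dataFileContainsHeader": True,
    "attributes": [
        {"attributeName": "age", "attributeType": "NUMERIC"},
        {"attributeName": "city", "attributeType": "CATEGORICAL"},
        {"attributeName": "income", "attributeType": "NUMERIC"},
    ],
})
sample_csv = "age,city,income\n31,Oslo,$50\n40,Rome,\n"

report = count_unparseable_numeric(sample_schema, sample_csv)
print(report)
```

If an attribute reports every row as bad, the schema file read by the script may not match the attribute types or column order chosen in the dashboard. The schema actually stored with a datasource can also be inspected with `client.get_data_source(DataSourceId=..., Verbose=True)`, which should include a `DataSourceSchema` field in the response.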
Jason Brown

0 Answers