When I create a datasource with the Python script below, at least one of my attributes shows 100% missing values. When I create the datasource manually in the AWS ML dashboard and apply the same attribute types, none of the values are missing. Is there a problem with how I'm creating the datasource from S3?

import boto3

# file_name_train, file_name_testing, bucket_name, file_extension and date
# are defined elsewhere in the script.
file_names = [file_name_train, file_name_testing]
client = boto3.client('machinelearning')

# Read the schema JSON once and reuse it for both datasources.
with open('../Selections/aws_schema.txt', 'r') as schema_file:
    schema = schema_file.read()

for file_name in file_names:
    response = client.create_data_source_from_s3(
        DataSourceId=file_name + date,
        DataSourceName=file_name + date,
        DataSpec={
            'DataLocationS3': 's3://' + bucket_name + '/' + file_name + file_extension,
            'DataSchema': schema,
        },
        ComputeStatistics=True
    )
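
To narrow this down, one thing that could be checked is whether the schema file passed by the script matches the schema of the datasource created in the dashboard. Below is a minimal sketch of such a comparison. It assumes both schemas are JSON strings in the Amazon ML schema format (an `attributes` list of `attributeName`/`attributeType` pairs plus top-level keys such as `dataFileContainsHeader`). `diff_schemas` is a hypothetical helper written for illustration, not part of boto3, and it does not call AWS itself.

```python
import json


def diff_schemas(local_schema: str, console_schema: str) -> dict:
    """Report differences between two Amazon ML schema JSON strings.

    Hypothetical helper: local_schema would be the contents of the schema
    file, console_schema the schema of the dashboard-created datasource.
    """
    local = json.loads(local_schema)
    remote = json.loads(console_schema)

    def attr_types(schema):
        # Map attribute name -> declared type, e.g. {"age": "NUMERIC"}.
        return {
            a["attributeName"]: a["attributeType"]
            for a in schema.get("attributes", [])
        }

    local_attrs = attr_types(local)
    remote_attrs = attr_types(remote)

    return {
        # Attributes declared in only one of the two schemas.
        "only_in_local": sorted(set(local_attrs) - set(remote_attrs)),
        "only_in_console": sorted(set(remote_attrs) - set(local_attrs)),
        # Attributes present in both but with different types.
        "type_mismatch": {
            name: (local_attrs[name], remote_attrs[name])
            for name in local_attrs.keys() & remote_attrs.keys()
            if local_attrs[name] != remote_attrs[name]
        },
        # Top-level settings that affect how the data file is parsed.
        "top_level_mismatch": {
            key: (local.get(key), remote.get(key))
            for key in ("dataFormat", "dataFileContainsHeader", "targetAttributeName")
            if local.get(key) != remote.get(key)
        },
    }
```

The schema of the dashboard-created datasource can be fetched with `client.get_data_source(DataSourceId=..., Verbose=True)['DataSourceSchema']` and passed in as `console_schema`. Any non-empty entry in the result points at a difference between the two datasources.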