The CSV file (valid_file.csv) is as follows:
id tech_manager vessel_name vessel_code
1 789 900 1
2 748 73 9
3 564 91.23 15
4 332 52.12 20
The JSON file (mainjson.json) is as follows:
{
"id":["decimal"],
"tech_manager":["decimal","int"],
"vessel_name":["decimal"],
"vessel_code":["range(0,10)"]
}
The JSON file specifies which validations to apply to each column. For example, id gets the decimal validation, while tech_manager gets both decimal and int.
How can I apply a range validation to the vessel_code column, whose range is given in the JSON file as min 0 and max 10?
The Python code to apply the validation is as follows:
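import json
from decimal import Decimal, InvalidOperation
import pandas as pd
import pandas_schema
from pandas_schema import Column
from pandas_schema.validation import CustomElementValidation

def check_decimal(dec):
    try:
        Decimal(dec)
    except InvalidOperation:
        return False
    return True

def check_int(num):  # not shown in the original post; assumed to mirror check_decimal
    try:
        int(num)
    except (TypeError, ValueError):
        return False
    return True

VALIDATORS = {
    'decimal': CustomElementValidation(lambda d: check_decimal(d), 'is not decimal'),
    'int': CustomElementValidation(lambda i: check_int(i), 'is not integer')}

def do_validation():
    data = pd.read_csv('valid_file.csv')
    with open('mainjson.json', 'r') as my_json:
        json_schema = json.load(my_json)
    # One Column per JSON key; 'range(0,10)' has no entry in VALIDATORS yet
    column_list = [Column(k, [VALIDATORS[v] for v in vals]) for k, vals in json_schema.items()]
    schema = pandas_schema.Schema(column_list)
    errors = schema.validate(data)
    pd.DataFrame({'col': errors}).to_csv('erro1.csv')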
This code applies the validations from the JSON file to each column and reports where a validation failed. I am able to apply the decimal validation to the columns, but I can't figure out how to apply the range validation. If a value in a column is outside the given range, the output should show the index at which it is out of range, using the JSON file together with pandas_schema.
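To make the goal concrete, here is a minimal sketch of the missing piece, written with the standard library only so it can be tried in isolation. It assumes the spec is always spelled "range(min,max)" as in mainjson.json and that both bounds are inclusive. The helper names (parse_range_spec, make_range_check, out_of_range_rows) are hypothetical and not part of pandas_schema.

```python
import re
from decimal import Decimal, InvalidOperation

# Matches specs such as 'range(0,10)'; the spelling is taken from mainjson.json.
RANGE_SPEC = re.compile(r'^range\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)$')


def parse_range_spec(spec):
    """Return (low, high) as Decimals if spec looks like 'range(0,10)', else None."""
    match = RANGE_SPEC.match(spec)
    if match is None:
        return None
    return Decimal(match.group(1)), Decimal(match.group(2))


def make_range_check(low, high):
    """Build a predicate that is True when low <= value <= high (inclusive: an assumption)."""
    def check(value):
        try:
            return low <= Decimal(str(value)) <= high
        except InvalidOperation:
            return False  # non-numeric values are treated as out of range
    return check


def out_of_range_rows(values, spec):
    """Return the positions of the values that fall outside the range named in spec."""
    bounds = parse_range_spec(spec)
    if bounds is None:
        raise ValueError(f'not a range spec: {spec!r}')
    check = make_range_check(*bounds)
    return [i for i, v in enumerate(values) if not check(v)]


# The vessel_code column from valid_file.csv
print(out_of_range_rows([1, 9, 15, 20], 'range(0,10)'))  # prints [2, 3]
```

The predicate returned by make_range_check could presumably be wrapped in CustomElementValidation in the same way check_decimal is in VALIDATORS. When building column_list, any name not found in VALIDATORS would then be tried with parse_range_spec first. pandas_schema also provides an InRangeValidation validator that may be a simpler fit, but its documentation should be checked for whether the upper bound is inclusive.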