The CSV file (valid_file.csv) is as follows:

id   tech_manager   vessel_name   vessel_code
1      789           900             1
2      748           73              9
3      564           91.23           15
4      332           52.12           20

The JSON file (mainjson.json) is as follows:

{
 "id":["decimal"],
 "tech_manager":["decimal","int"],
 "vessel_name":["decimal"],
 "vessel_code":["range(0,10)"]
}

The JSON file specifies which validations to apply to each column. For example, id gets only the decimal validation, while tech_manager gets both decimal and int.

How can I apply a range validation to the vessel_code column, using the range given in the JSON file (min 0 and max 10)?

The Python code that applies the validation is as follows:

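import json
from decimal import Decimal, InvalidOperation
import pandas as pd
import pandas_schema
from pandas_schema import Column
from pandas_schema.validation import CustomElementValidation

def check_decimal(dec):
    # True if the value can be parsed as a Decimal
    try:
        Decimal(dec)
    except InvalidOperation:
        return False
    return True

def check_int(num):
    # Not shown in the original post; assumed to mirror check_decimal
    try:
        int(num)
    except ValueError:
        return False
    return True

# Maps the validator names used in the JSON file to validation objects
VALIDATORS = {
    'decimal': CustomElementValidation(lambda d: check_decimal(d), 'is not decimal'),
    'int': CustomElementValidation(lambda i: check_int(i), 'is not integer')}

def do_validation():
    data = pd.read_csv('valid_file.csv')
    with open('mainjson.json', 'r') as my_json:
        json_schema = json.load(my_json)

    # One Column per JSON key, with the validators listed for that key
    column_list = [Column(k, [VALIDATORS[v] for v in vals]) for k, vals in json_schema.items()]
    schema = pandas_schema.Schema(column_list)
    errors = schema.validate(data)
    pd.DataFrame({'col': errors}).to_csv('erro1.csv')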

This code applies the validations from the JSON file to each column and reports where a validation failed. I am able to apply the decimal validation this way, but I can't work out how to apply the range validation. If a value in a column is outside the given range, the output should show the index at which it is out of range, using the JSON file as the pandas_schema schema.
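To make the goal concrete, here is a minimal sketch of the range check I have in mind, written without pandas_schema. The names RANGE_SPEC, parse_range and make_range_check are hypothetical helpers, not part of any library, and I am assuming both bounds are inclusive. It parses a spec string such as "range(0,10)" and lists the row indices whose values fall outside it.

```python
import re

# Hypothetical pattern for specs such as "range(0,10)" from the JSON file.
RANGE_SPEC = re.compile(
    r"^range\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)$"
)

def parse_range(spec):
    """Return (low, high) as floats if spec looks like 'range(a,b)', else None."""
    match = RANGE_SPEC.match(spec)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

def make_range_check(low, high):
    """Build a predicate that is True when low <= value <= high (assumed inclusive)."""
    def check(value):
        try:
            return low <= float(value) <= high
        except (TypeError, ValueError):
            # Non-numeric values count as out of range.
            return False
    return check

# Values of the vessel_code column from valid_file.csv.
vessel_code = [1, 9, 15, 20]

low, high = parse_range("range(0,10)")
in_range = make_range_check(low, high)

# Row indices (0-based) whose value is outside the range.
bad_rows = [i for i, v in enumerate(vessel_code) if not in_range(v)]
print(bad_rows)  # [2, 3]
```

A predicate like in_range could presumably be wrapped in CustomElementValidation in the same way as check_decimal, for entries where parse_range returns a pair rather than None. pandas_schema also has an InRangeValidation class, though I have not confirmed how it treats the upper bound.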

arpita