below code to train csv model and active learning involved init how to integrate console_label function with restful api(eg fastapi)
Create a new deduper object and pass our data model to it.
deduper = dedupe.Dedupe(fields)
# If we have training data saved from a previous run of dedupe,
# look for it and load it in.
# __Note:__ if you want to train from scratch, delete the training_file
if os.path.exists(training_file):
print('reading labeled examples from ', training_file)
with open(training_file, 'rb') as f:
deduper.prepare_training(data_d, f)
else:
deduper.prepare_training(data_d)
# ## Active learning
# Dedupe will find the next pair of records
# it is least certain about and ask you to label them as duplicates
# or not.
# use 'y', 'n' and 'u' keys to flag duplicates
# press 'f' when you are finished
print('starting active labeling...')
dedupe.console_label(deduper)
# Using the examples we just labeled, train the deduper and learn
# blocking predicates
deduper.train()