Questions tagged [python-dedupe]

Questions about the dedupe Python library (a library for probabilistic deduplication and record linkage)

Dedupe is an open-source Python library for probabilistic deduplication, record linkage, and entity resolution.

67 questions
0
votes
2 answers

Python Dedupe Package Error: "Records do not line up with data model". But everything looks OK

I'm following various tutorials for python dedupe online, but keep coming across this error whichever one I try: ValueError: Records do not line up with data model. The field 'firstname ' is in data_model but not in a record Somebody on their…
SCool
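The field name quoted in that error, 'firstname ', ends with a space, which suggests the field definition and the record keys differ only by whitespace. A minimal sketch of a check for that kind of mismatch, assuming records are stored as a dict mapping a record id to a dict of field values; the helper name `find_misaligned_fields` and the sample data are hypothetical and not part of dedupe:

```python
def find_misaligned_fields(records, field_names):
    """Map each field name missing from the records to a record key that
    matches it once whitespace is stripped (or None if nothing matches)."""
    record_keys = set()
    for record in records.values():
        record_keys.update(record)

    problems = {}
    for name in field_names:
        if name in record_keys:
            continue  # this field lines up with the records
        # Look for a key that differs only by leading/trailing whitespace.
        match = next(
            (key for key in sorted(record_keys) if key.strip() == name.strip()),
            None,
        )
        problems[name] = match
    return problems


# Hypothetical sample data: the field list carries a stray trailing space.
records = {
    0: {"firstname": "Ann", "lastname": "Lee"},
    1: {"firstname": "Bob", "lastname": "Ray"},
}
fields = ["firstname ", "lastname"]

problems = find_misaligned_fields(records, fields)
print(problems)
```

If the check reports a near match, stripping whitespace from the field names (or from the CSV header row the records were read from) is one likely fix.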
0
votes
1 answer

No module named zope.index

I get the following error when trying to import a library that depends on zope: No module named zope.index. My Python path is correct (I can import other libraries). I already created an __init__.py file in the zope folder, but it still isn't working, so I'm not sure…
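One way to narrow this down is to ask the interpreter directly whether it can locate the module, rather than inspecting folders by hand. A minimal sketch using only the standard library; the helper name `can_import` is hypothetical, and the sketch assumes the check is run with the same interpreter that raises the error:

```python
import importlib.util


def can_import(module_name):
    """Return True if the current interpreter can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "zope") is itself missing.
        return False


for name in ("zope", "zope.index"):
    print(name, "found" if can_import(name) else "not found")
```

If "zope" is found but "zope.index" is not, the zope.index distribution is probably not installed in this environment; adding an `__init__.py` by hand is unlikely to help, since zope.index is published as its own package.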
0
votes
1 answer

Apache Nifi - Federated Search

My team has been thrown into the deep end and asked to build a federated search of customers over a variety of large datasets, which hold varying degrees of differing data about each individual (and no matching identifiers), and I was…
0
votes
0 answers

Installing Specific Version of Numpy for Dedupe Error

I am new to Python and already running into some issues. To clean some data I wanted to try dedupe / csvdedupe. It needed numpy to run, so I installed it (worked without a problem) with pip install numpy, then pip install dedupe. But when I wanted to install it…
Mally
0
votes
0 answers

dedupe OverflowError on record linkage

I want to use the Dedupe library for record linkage. I wrote this code from the Dedupe examples on GitHub, but when I run it I get this error: OverflowError: Python int too large to convert to C ssize_t. It's because my data are very big. How I…
Dr Sima
0
votes
1 answer

Values are not inserted into MySQL table using pool.apply_async in python2.7

I am trying to run the following code to populate a table in parallel for a certain application. First the following function is defined which is supposed to connect to my db and execute the sql command with the values given (to insert into…
-3
votes
1 answer

Error while Installing dedupe conda package on windows

Error while installing the dedupe package. Please help me solve this error: after issuing conda install -c derickl dedupe, I received a PackagesNotFoundError on Windows 10.