
I am trying to do fuzzy matching in an Excel spreadsheet on two data sources exported from Microsoft CRM. One is Account data and the other is Lead data.

The Accounts sheet contains all the existing customers. The Leads sheet is a 'dirty' sheet: it contains all the existing customers plus the leads.

I would like to match the two sheets against each other using fuzzy matching, since an exact match won't work in my case, and thereby filter out the 'actual leads' from the Leads sheet.

Example:

Accounts Sheet -

Courtyard by Marriott-Calgary
Courtyard Middletown
Granite Broadway Development LLC
Infinity formerly known as Residence inn
Inversiones Hoteleras CH de Escazú S.A
Marriott Chicago O'Hare Hotel
Marriott Residence Inn
Marriott Residence Inn
Marriott Residence Inn

Leads Sheet -

Residence Inn by Marriott
Residence Inn and Courtyard by Marriott Manhattan
Residence Inn by Marriott Edinburgh
Residence Inn
Residence Inn Bellevue
Residence Inn Phoenix Glendale
Residence Inn Phoenix Glendale
Residence Inn Phoenix Glendale
Residence Inn Phoenix Glendale

In this example I have only shown the company name column, but I will actually be performing this on a set of columns (company name, address 1, address 2, etc.).

Is there any way I can accomplish this using DataNitro?

hky404
  • What exactly is the question, and how is it particularly relevant to DataNitro? There are no DN functions for fuzzy string matching, but you can import any Python library into your DN script. If you can be more specific about the question, perhaps I can add something useful. In the meantime, I suggest you check out the jellyfish library for string matching. – Woody Pride May 04 '14 at 02:38
  • I am looking to do a fuzzy match on one group of columns against another group of columns in Excel. The question is related to DataNitro because I want to do this in Excel using Python. Thanks for the info about the jellyfish library for string matching; I am going to have a look at it. – hky404 May 05 '14 at 18:59
  • In any event, I would read the data into a pandas object or similar rather than working directly in Excel; you can then paste the matched values back using DN. I prepared three notebooks as a reference for jellyfish, since documentation is scarce: https://programtheworld.wordpress.com/python/ – Woody Pride May 06 '14 at 00:04
  • Thanks a lot, Woody. I am going to follow that procedure and will post if I have any questions about it. – hky404 May 06 '14 at 18:35
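
The approach suggested in the comments (pull the names out of Excel, score them in Python, then write the results back) could look roughly like the sketch below. It is only an illustration under several assumptions: it uses the standard library's `difflib.SequenceMatcher` in place of jellyfish so that it runs without extra installs, the `0.8` threshold and the column weights are arbitrary placeholders, and the helper names (`normalize`, `similarity`, `best_match`, `split_leads`, `record_similarity`) are hypothetical. Reading from and writing to the workbook (via DataNitro or pandas) is left out.

```python
from difflib import SequenceMatcher

# Sample names taken from the question; real data would come from the two sheets.
ACCOUNTS = [
    "Courtyard by Marriott-Calgary",
    "Courtyard Middletown",
    "Granite Broadway Development LLC",
    "Marriott Chicago O'Hare Hotel",
    "Marriott Residence Inn",
]
LEADS = [
    "Residence Inn by Marriott",
    "Residence Inn Bellevue",
    "Residence Inn Phoenix Glendale",
]


def normalize(name):
    """Lowercase, replace punctuation with spaces, and sort the words
    so that word order does not affect the comparison."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Similarity in [0, 1] between two strings after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(lead, accounts):
    """Return (account, score) for the account most similar to `lead`."""
    scored = ((account, similarity(lead, account)) for account in accounts)
    return max(scored, key=lambda pair: pair[1])


def split_leads(leads, accounts, threshold=0.8):
    """Split leads into probable existing customers and probable new leads.

    `threshold` is an assumed cut-off and needs tuning on real data.
    """
    existing, actual = [], []
    for lead in leads:
        account, score = best_match(lead, accounts)
        bucket = existing if score >= threshold else actual
        bucket.append((lead, account, round(score, 2)))
    return existing, actual


def record_similarity(lead_row, account_row, weights):
    """Weighted average of per-column similarities for multi-column matching.

    `weights` maps column name to weight; the values are assumptions.
    """
    total = sum(weights.values())
    return sum(
        weight * similarity(lead_row[col], account_row[col])
        for col, weight in weights.items()
    ) / total


if __name__ == "__main__":
    existing, actual = split_leads(LEADS, ACCOUNTS)
    print("Probably existing customers:", existing)
    print("Probably actual leads:", actual)
```

Sorting the words before comparing is a deliberate choice here, so that "Residence Inn by Marriott" and "Marriott Residence Inn" score as close matches despite the different word order. Borderline scores are worth reviewing by hand before anything is removed from the Leads sheet.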
