0

I'm trying to build a knowledge base in SQL Server 2012 DQS. In the raw data that I am trying to run through DQS, I've got a field that is a full name field that unfortunately can hold both human and company names. I was wondering if anyone had advice on how to cleanse and parse this field using DQS without having to purchase a reference data service so that if its a company name, it puts the whole field into the FirstName or LastName domain but if its not a company name then it parses the whole name out into the appropriate domains.

I set up a composite domain named FullName which is made up of FirstName, MiddleName and LastName domains and mapped it to my raw data. I then used the composite domain knowledge based parsing to try to parse these values into their respective fields but unfortunately it tries to parse company names as if they were people so I get a lot of last names like "& Associates Inc", etc.

It seems like the easiest way to tackle this would be to create a domain of business words and have a composite domain rule that basically says, if any of the words in the FullName domain are in the BusinessWord domain then put the whole name in the FirstName domain, else try to parse the name into first, middle, last. Is this type of lookup beyond the capabilities of DQS?

So far the only thing i can find for DQS CD rules is to create rules that are hardcoded, i.e. if city Is "London" then change country to "England", but I would like a more dynamic rule that says if BusinessWord In FullName then put the whole thing in FirstName. Just wondering if anyone who is trying out DQS has come across a problem like this and if its solvable without having to pay a ridiculous amount of money for a reference data service.

If you can't clean names and addresses with this thing then what is it good for?

Thanks in advance.

IT ppl
  • 2,626
  • 1
  • 39
  • 56
DrRocket
  • 85
  • 1
  • 5

1 Answers1

0

I would tackle this by using the KB in a DQS task as part of an SSIS workflow. This blog post has a reference on how to use the DQS task. For your situation I would use the output of the status column (define a rule) and then redirect the output as per your business rules via additional SSIS tasks.

I know this is not ideal, however I think it would solve the immediate task at hand.

Lynn Langit
  • 4,030
  • 1
  • 23
  • 31