Let's say I have a database of books that includes their titles. For a given listing from eBay or Craigslist or some other such site, I want to compare its title string to all of the book titles in my database to try to find a match.
It's unlikely there will ever be exact string equality, as users on those sites like to add things like "perfect condition" and "fast shipping" to their listing titles to attract buyers.
What algorithm(s) should I use to do this type of correlation? I'm aware of n-grams and Levenshtein distance, but I don't know which would do the most accurate job.
For the various applicable algorithms, how does their computational performance compare? Would it make sense to use multiple algorithms and average their results to balance their strengths and weaknesses? Would it be possible to set a minimum level of confidence? I'd rather have no match than a very poor quality match.
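To make the idea concrete, here is a minimal sketch of the kind of combined scoring I have in mind, not a definitive implementation. It assumes character trigrams compared by Jaccard overlap, a plain Levenshtein distance normalized by the length of the longer string, an unweighted average of the two scores, and a cutoff below which no match is returned. All function names and the 0.5 threshold are hypothetical choices made for illustration.

```python
import re


def normalize(text):
    """Lowercase and collapse anything that is not a letter or digit into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def trigrams(text):
    """Return the set of character 3-grams of the normalized text."""
    s = normalize(text)
    return {s[i:i + 3] for i in range(len(s) - 2)}


def ngram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def levenshtein(a, b):
    """Edit distance via dynamic programming, keeping only one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Map edit distance to [0, 1] by dividing by the longer normalized length."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # assumption: empty strings never count as a match
    return 1.0 - levenshtein(a, b) / longest


def best_match(listing, titles, min_confidence=0.5):
    """Return (title, score) for the best candidate, or (None, score) below the cutoff."""
    best_title, best_score = None, 0.0
    for title in titles:
        # Assumption: a simple unweighted average of the two similarity scores.
        score = (ngram_similarity(listing, title)
                 + levenshtein_similarity(listing, title)) / 2
        if score > best_score:
            best_title, best_score = title, score
    if best_score < min_confidence:
        return None, best_score
    return best_title, best_score
```

Calling `best_match(listing_title, all_titles)` would scan every title in the database, so the cost grows with the number of books times the Levenshtein cost per pair. Extra words such as "fast shipping" raise the edit distance and enlarge the trigram union, so both scores drop for a correct match, and I'm unsure how to choose a cutoff that tolerates that noise.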