0

I have an interesting problem. My goal is to build software that accepts a person's full name, date of birth, and some other credentials (exactly which ones is open to debate) and extracts as much information about that person from the internet as possible.

I have done some research and found that, by using Google's search API and a web crawler such as Scrapy, I can achieve this to some extent. But simply searching for a person's name on Google in double quotes doesn't always yield the right results.

Two questions come to mind. First, how can I increase the accuracy? Second, am I reinventing the wheel, given that some sites can already find people? If so, is there open-source code (or anything usable) that does this or something similar?

ArslanW
  • 353
  • 1
  • 10

1 Answer

1

This answer is about how to run a scraper over a large number of URLs. For instance, you can start with SmokeDoc.
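Since a bare quoted name is often too broad, one option is to generate several query variants that pair the name with other known attributes and hand the resulting URLs to the scraper. Below is a minimal sketch of that idea, not a tested recipe. The helper name `build_queries` and the sample attribute values are hypothetical, and the step that actually sends the queries to a search API is deliberately left out.

```python
def build_queries(full_name, attributes):
    """Return quoted search queries, ordered from most to least specific.

    full_name  -- the person's name, searched as an exact phrase
    attributes -- extra known facts (city, employer, ...), each quoted too
    """
    name = f'"{full_name.strip()}"'
    # Skip empty or whitespace-only attributes.
    extras = [f'"{a.strip()}"' for a in attributes if a and a.strip()]

    queries = []
    if extras:
        # Most specific: the name plus every attribute at once.
        queries.append(" ".join([name] + extras))
    # Then the name paired with one attribute at a time.
    queries.extend(f"{name} {extra}" for extra in extras)
    # Fallback: the bare quoted name.
    queries.append(name)

    # Remove duplicates while preserving order.
    return list(dict.fromkeys(queries))


if __name__ == "__main__":
    for q in build_queries("Jane Doe", ["Lahore", "Acme Corp"]):
        print(q)
```

Starting with the most specific query and falling back to looser ones keeps the first results tightly matched, at the cost of missing pages that mention only some of the attributes.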

Mihai8
  • 3,113
  • 1
  • 21
  • 31
  • Thanks for the feedback and suggestion; I will look into it. But my biggest issue is searching for information about people accurately. – ArslanW Mar 01 '13 at 07:32
  • An algorithm that takes a large number of attributes into account will determine the accuracy of the search. – Mihai8 Mar 01 '13 at 10:57
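
One possible reading of the last comment is to score every scraped page by how many of the person's known attributes it mentions, and to keep only the pages above some threshold. The sketch below assumes plain case-insensitive substring matching, which is crude. The function names, the attribute keys, and the weights are all hypothetical.

```python
import re


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def score_page(page_text, attributes, weights=None):
    """Return the weighted fraction of attributes found in the page (0.0 to 1.0).

    attributes -- dict such as {"name": "Jane Doe", "city": "Lahore"}
    weights    -- optional dict of per-attribute weights (default 1.0 each)
    """
    text = normalize(page_text)
    weights = weights or {}
    total = 0.0
    matched = 0.0
    for key, value in attributes.items():
        weight = weights.get(key, 1.0)
        total += weight
        # Substring match only; real data would need fuzzier comparison.
        if normalize(value) in text:
            matched += weight
    return matched / total if total else 0.0


if __name__ == "__main__":
    person = {"name": "Jane Doe", "city": "Lahore", "employer": "Acme"}
    page = "Jane  Doe was born in Lahore."
    print(score_page(page, person))
```

Giving rarer attributes (an employer, a date of birth) a higher weight than common ones (a city) would let a page that matches the distinctive facts outrank one that only repeats the name.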