I am currently working on a concept for a matching algorithm that operates on a large amount of data, and it's my first time doing this.
Here's the situation:
- we have X objects of type "House" with features such as size, location, and so on
- we have people looking for houses; their search criteria include size, location, and so on
=> we want to match houses to people based on their preferences (size, location, ...); a small sketch of the representation I have in mind follows below
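For concreteness, here is a minimal sketch of that shared representation (the feature names and values are made up): a house and a buyer's search are vectors in the same feature space, so matching reduces to comparing vectors.

```python
import numpy as np

# Hypothetical feature vector: [size_m2, latitude, longitude].
# A house and a buyer's search live in the same feature space.
house = np.array([120.0, 48.14, 11.58])
search = np.array([100.0, 48.15, 11.60])

# Naive match score: Euclidean distance (smaller = better match).
# Real features would need scaling/weighting first, since size and
# coordinates are on very different scales.
score = np.linalg.norm(house - search)
print(score)
```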
What's the better approach?
1) Cluster all houses and determine which cluster a prospective buyer belongs to, i.e. match people and houses with similar feature values such as size and location (a toy sketch of this option follows below)
2) Build a recommender, which would also require data on many people who bought houses in the past in our HDFS
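To make option 1 concrete, here is a toy sketch in plain Python/scikit-learn; it only illustrates the logic, not the Mahout API, and all data is invented:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one row per house as [size_m2, latitude, longitude].
houses = np.array([
    [120.0, 48.14, 11.58],
    [ 80.0, 48.15, 11.60],
    [200.0, 52.52, 13.40],
    [ 95.0, 52.50, 13.39],
])

# Scale first: size and coordinates are on very different scales,
# otherwise size would dominate the distance metric.
scaler = StandardScaler()
houses_scaled = scaler.fit_transform(houses)

# Cluster the houses; k=2 only makes sense for this toy data,
# in practice k would have to be tuned (e.g. via the elbow method).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(houses_scaled)

# A buyer's search expressed in the same feature space.
buyer = np.array([[100.0, 48.14, 11.59]])
cluster = kmeans.predict(scaler.transform(buyer))[0]

# Candidate matches: every house in the buyer's cluster.
matches = houses[kmeans.labels_ == cluster]
print(matches)
```

Option 2, as noted, would additionally need historical transaction data (which buyers bought which houses), which we would have to accumulate in HDFS first.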
Which technology stack should I use for whichever approach is better?
I am currently thinking of Hadoop/Hive (storage), Sqoop (getting data into storage), and Mahout (analysis).
Your help is much appreciated! Thanks in advance!