
I have a site in the stock market domain. It has a lot of user-generated content, such as forum posts and comments.

I also have a database table containing the names of all companies (around 5000) listed on the stock market.

What I want is this: when a user mentions a company name in a comment or forum post, my program should automatically hyperlink it to that company's stock price details.

The problem is that the user may not write the company name exactly as it is stored in my database. For example, a user might write "FB" instead of Facebook, or write the company name without "inc" or "pvt. ltd" in it.

How do I solve this problem? Since the company database is limited, I think a machine learning approach would be overkill. What are your suggestions?

Victor Hurdugaci
milan m

1 Answer


The easiest way would be to store multiple possible names for each company, e.g. FB would be treated the same as FaceBook.

This can be done in two ways:

1) Extend the list itself (the 5000 items) by adding all alternatives too. This results in a considerably larger database.

2) Create a conversion list that only maps alternatives to the stored name, e.g. FB->FaceBook etc. Then, after applying the conversion, the existing company database can be used as it is. This splits the responsibility.
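As a rough illustration of option 2, here is a minimal Python sketch. The `COMPANIES` and `ALIASES` dictionaries, the URLs in them, and the `resolve` function are all hypothetical names made up for this example; in practice both mappings would presumably be loaded from the database.

```python
# Hypothetical data: canonical company name -> made-up details URL.
COMPANIES = {
    "FaceBook": "/stocks/FB",
    "Google": "/stocks/GOOG",
}

# The separate conversion list: alternative spelling -> canonical name.
ALIASES = {
    "FB": "FaceBook",
    "GOOG": "Google",
}

def resolve(mention):
    """Return the canonical company name for a mention, or None if unknown."""
    # First apply the conversion, falling back to the mention itself.
    canonical = ALIASES.get(mention, mention)
    # Then use the existing company list unchanged.
    return canonical if canonical in COMPANIES else None
```

With this split, adding a new alternative only touches `ALIASES`; the company list itself stays at its original size.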

You can also experiment with matching on only the word parts of names, ignoring spacing, case and suffixes (e.g. Face Book -> FaceBook, or Facebook->FaceBook, xxx.INC->xxx.inc etc).
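One way to try this is to reduce both the stored names and the user's text to a normalized key before comparing. The sketch below is only an assumption about how that could look: the `SUFFIXES` set, the sample company names, and the functions `normalize`, `build_index` and `lookup` are invented for illustration.

```python
import re

# Assumed set of legal suffixes to ignore; extend it for your market.
SUFFIXES = {"inc", "ltd", "pvt", "corp", "co", "llc", "plc"}

def normalize(name):
    """Lowercase, drop punctuation and legal suffixes, join the remaining words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    kept = [w for w in words if w not in SUFFIXES]
    return "".join(kept)

def build_index(company_names):
    """Map each normalized key to the name as stored in the database."""
    return {normalize(n): n for n in company_names}

# Hypothetical sample of stored company names.
index = build_index(["FaceBook Inc", "Acme Pvt. Ltd"])

def lookup(mention):
    """Return the stored company name matching a mention, or None."""
    return index.get(normalize(mention))
```

Two different companies can collapse to the same key after normalization, so it is probably worth checking the 5000 names for collisions once when building the index.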

Michel Keijzers
  • @MemetOlsen I'm not sure if I understand your comment ... he wants to have alternatives I guess. Of course all alternatives should be chosen 'well'. – Michel Keijzers Oct 26 '15 at 11:02