I have a very long list of medicine names (crocin, seroflo, oxitab, etc.). I need to check whether a particular medicine is present in the list, while allowing for typos. For example, if I intend to look up crocin but type crosin instead, I want the machine learning algorithm to tolerate the typo and report a match for small differences such as crocin vs. crosin.
Here at Stack Overflow, code is favored over a link to a website, because once the link has changed, the question will no longer have historical value. Visit [here](http://stackoverflow.com/editing-help) for help with formatting code into your question. – Cody Guldner Aug 23 '13 at 03:07
2 Answers
I don't think you need machine learning; a simple edit distance algorithm should do that.
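As a rough illustration of the edit distance idea, here is a minimal Python sketch of Levenshtein distance with a lookup over the list. The function names `edit_distance` and `find_matches`, the `max_distance=1` threshold, and the three-item sample list are assumptions made for this example, not something from the question; the right threshold depends on how similar the real medicine names are to each other.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def find_matches(query, names, max_distance=1):
    """Return names within max_distance edits of query, closest first.
    max_distance is an assumed threshold; tune it for your data."""
    query = query.lower()
    scored = [(edit_distance(query, name.lower()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_distance]


medicines = ["crocin", "seroflo", "oxitab"]  # sample data only
print(find_matches("crosin", medicines))     # ['crocin']
```

This scans the whole list for every query, which is fine for a few thousand names. For a much longer list you would probably want an index (for example a BK-tree or n-gram prefilter) in front of it.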

I had to downvote this. Take a look at the following resources: http://eprints.whiterose.ac.uk/884/1/hodgevj10.pdf http://research.microsoft.com/pubs/68884/spell-correct-acl02.pdf http://research.ihost.com/and2007/cd/Proceedings_files/p79.pdf http://acl.ldc.upenn.edu/acl2002/MAIN/pdfs/Main336.pdf – Kumar Vaibhav Mar 26 '14 at 13:51
Also look at: http://books.google.co.in/books?id=7Mx1Y7bMSiAC&pg=PA384&lpg=PA384&dq=spell+correction+using+neural+networks&source=bl&ots=bPFUmEROUp&sig=H7MjHqdCOs1U_MxaP86Jx-JA4y0&hl=en&sa=X&ei=TcgyU_S_M4OnrAe7sYDwBQ&ved=0CHUQ6AEwBw#v=onepage&q=spell%20correction%20using%20neural%20networks&f=false http://delivery.acm.org/10.1145/1080000/1075255/p286-brill.pdf?ip=115.111.134.66&id=1075255&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=427131815&CFTOKEN=90074681&__acm__=1395837917_5a6a5e0725f8a9b61a970224a86f8371 – Kumar Vaibhav Mar 26 '14 at 13:56
There are many more resources for doing spell correction using ML! – Kumar Vaibhav Mar 26 '14 at 13:57
I agree that the need for ML methods is doubtful. But if you really want to use a learning-based method for "spelling correction" (I am not sure whether this works well for medicine names), you can refer to the papers below:
A winnow-based approach to context-sensitive spelling correction
An improved error model for noisy channel spelling correction
A large scale ranker-based system for search query spelling correction
A discriminative model for query spelling correction with latent structural SVM
A Graph Approach to Spelling Correction in Domain-Centric Search.
And this paper is about correction for person names:
Hashing-based approaches to spelling correction of personal names

Could a database be used to solve this? Suppose I put the two lists into tables named regular_list and new_list. Could a query then compute the distance mentioned above and have the database return the matches accordingly? – rohit Aug 30 '13 at 15:32
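One way this could work, sketched with Python's built-in `sqlite3` module: register an edit distance function with the connection and call it from the join condition. The table names `regular_list` and `new_list` come from the comment above; the `name` column, the sample rows, the `levenshtein` helper, and the threshold of 1 are assumptions for illustration. Some database systems ship such a function themselves (PostgreSQL's `fuzzystrmatch` extension provides `levenshtein`), in which case no custom function is needed.

```python
import sqlite3


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name "levenshtein".
conn.create_function("levenshtein", 2, levenshtein)

conn.execute("CREATE TABLE regular_list (name TEXT)")
conn.execute("CREATE TABLE new_list (name TEXT)")
conn.executemany("INSERT INTO regular_list VALUES (?)",
                 [("crocin",), ("seroflo",), ("oxitab",)])  # sample rows
conn.executemany("INSERT INTO new_list VALUES (?)",
                 [("crosin",), ("seroflo",), ("aspirin",)])  # sample rows

# Pair every new name with known names at most one edit away.
rows = conn.execute("""
    SELECT n.name, r.name, levenshtein(n.name, r.name) AS dist
    FROM new_list AS n
    JOIN regular_list AS r ON levenshtein(n.name, r.name) <= 1
    ORDER BY n.name
""").fetchall()

for new_name, known_name, dist in rows:
    print(new_name, "->", known_name, "(distance", dist, ")")
```

Note that this join compares every row of one table with every row of the other, so it will get slow as both tables grow.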