1

I have a very long list of medicine names (for example crocin, seroflo, oxitab, etc.). I need to check whether a particular medicine is present in the list, but the search term may contain typos. For example, I might intend to look up crocin but type crosin instead. I want a machine learning algorithm that tolerates such typographical errors, so that for small differences like crocin vs. crosin it still reports a match.

rohit
  • 47
  • 2
  • 10
  • 1
    Here at Stack Overflow, code is favored over a link to a website, because once the link has changed, the question will no longer have historical value. Visit [here](http://stackoverflow.com/editing-help) for help with formatting code into your question. – Cody Guldner Aug 23 '13 at 03:07

2 Answers

5

I don't think you need machine learning; a simple edit distance algorithm should do that.

https://en.wikipedia.org/wiki/Edit_distance
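As a rough illustration of the edit distance idea, here is a minimal Python sketch using Levenshtein distance (one common kind of edit distance). The function names `edit_distance` and `find_matches`, the sample list, and the threshold `max_distance=1` are assumptions made for this example, not something from the question; the right threshold depends on how long the names are and how similar they are to each other.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def find_matches(query, names, max_distance=1):
    """Return the names within max_distance edits of query, closest first.
    max_distance is a hypothetical tuning knob chosen for this sketch."""
    query = query.lower()
    scored = [(edit_distance(query, name.lower()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_distance]


medicines = ["crocin", "seroflo", "oxitab"]  # sample data from the question
print(find_matches("crosin", medicines))  # ['crocin']
```

This compares the query against every name, which may be acceptable for a list of a few thousand entries. For a much longer list, some form of indexing or pre-filtering (for example by length) would probably be needed.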

Matti Lyra
  • 12,828
  • 8
  • 49
  • 67
  • 2
    I had to downvote this. Take a look at the following resources: http://eprints.whiterose.ac.uk/884/1/hodgevj10.pdf http://research.microsoft.com/pubs/68884/spell-correct-acl02.pdf http://research.ihost.com/and2007/cd/Proceedings_files/p79.pdf http://acl.ldc.upenn.edu/acl2002/MAIN/pdfs/Main336.pdf – Kumar Vaibhav Mar 26 '14 at 13:51
  • 1
    Also look at - http://books.google.co.in/books?id=7Mx1Y7bMSiAC&pg=PA384&lpg=PA384&dq=spell+correction+using+neural+networks&source=bl&ots=bPFUmEROUp&sig=H7MjHqdCOs1U_MxaP86Jx-JA4y0&hl=en&sa=X&ei=TcgyU_S_M4OnrAe7sYDwBQ&ved=0CHUQ6AEwBw#v=onepage&q=spell%20correction%20using%20neural%20networks&f=false http://delivery.acm.org/10.1145/1080000/1075255/p286-brill.pdf?ip=115.111.134.66&id=1075255&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=427131815&CFTOKEN=90074681&__acm__=1395837917_5a6a5e0725f8a9b61a970224a86f8371 – Kumar Vaibhav Mar 26 '14 at 13:56
  • 2
    There are many more resources for doing spell correction using ML! – Kumar Vaibhav Mar 26 '14 at 13:57
1

I agree that the need for ML methods here is doubtful. But if you really want to use a learning-based method for "spelling correction" (I am not sure whether this works well for medicine names), you can refer to the papers below:

A winnow-based approach to context-sensitive spelling correction

An improved error model for noisy channel spelling correction

A large scale ranker-based system for search query spelling correction

A discriminative model for query spelling correction with latent structural SVM

A Graph Approach to Spelling Correction in Domain-Centric Search.

And this paper is about correction for person names:

Hashing-based approaches to spelling correction of personal names

dragonxlwang
  • 462
  • 5
  • 13
  • Could a database be used to solve this? Suppose I store the two lists as tables named regular_list and new_list. Could a query then compute the above-mentioned distance between them and have the database return the matches accordingly? – rohit Aug 30 '13 at 15:32