Questions tagged [soundex]

Soundex is a phonetic algorithm for indexing names based on their pronunciation in spoken English.

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
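As an illustration of how similar-sounding names end up with the same code, here is a minimal Python sketch of the commonly described American Soundex rules. It is not the implementation used by any particular database or library, and results may differ from theirs in edge cases. The function name `soundex` and the `CODES` table are names chosen for this example only.

```python
# Assumption: this follows the commonly described American Soundex rules;
# database and library implementations may differ in edge cases.

# Map each consonant to its digit group; vowels, H, W and Y have no digit.
CODES = {}
for digit, group in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return a 4-character Soundex code, or "" if name has no ASCII letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    result = [first]
    prev = CODES.get(first, "")  # the first letter's digit also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same digit
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # a vowel resets prev, so a repeated digit is kept after it
    # Truncate to one letter plus three digits, padding with zeros if short.
    return "".join(result)[:4].ljust(4, "0")


if __name__ == "__main__":
    for name in ["Robert", "Rupert", "Tymczak", "Pfister"]:
        print(name, soundex(name))
```

With these rules, "Robert" and "Rupert" both encode to R163, "Tymczak" to T522 and "Pfister" to P236.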

Soundex is the most widely known of all phonetic algorithms mainly because it is a standard feature of popular database software (such as MySQL, MS SQL Server and Oracle) and some programming languages (such as PHP).

Soundex was developed by Robert C. Russell and Margaret K. Odell and patented in 1918 and 1922.

159 questions
4
votes
0 answers

Autocorrecting all misspelled text data in R

I have been searching for a long time for methods to correct typos in text in R without manually adding or replacing words. I have text data consisting of patients' complaints in an emergency department. After performing a simple Random…
Diana01
  • 183
  • 1
  • 1
  • 10
4
votes
4 answers

Intelligent web features, algorithms (people you may follow, similar to you ...)

I have 3 main questions about the algorithms in the intelligent web (web 2.0). Here is the book I'm reading: http://www.amazon.com/Algorithms-Intelligent-Web-Haralambos-Marmanis/dp/1933988665, and I want to learn the algorithms in more depth. 1. People You may…
user467871
4
votes
1 answer

Soundex and checking for invalid sound

Is it OK to check for SQL returning a Soundex of 0000, on the assumption that the input isn't a valid word (e.g. it has digits, spaces or special characters), or is there a better way to do this?
jaffa
  • 26,770
  • 50
  • 178
  • 289
4
votes
1 answer

Output of Soundex algorithm implementation is wrong for cases - "Tymczak" and "Pfister"

When I tested the Soundex algorithm as described in the Wikipedia article on Soundex, I found that Tymczak returned T520, not T522, and Pfister returned P123, not P236. I have no idea why the output is not correct. My code: using System; using…
Bav
  • 139
  • 2
  • 3
  • 14
4
votes
1 answer

Replace words using Soundex, python

I have a list of sentences, and my aim is to replace all the different occurrences of prepositions in the form "opp,nr,off,abv,behnd" with their correct spellings "opposite,near,above,behind" and so on. The Soundex codes of the words are the same, so I…
Hypothetical Ninja
  • 3,920
  • 13
  • 49
  • 75
4
votes
2 answers

PHP/MySQL: Highlight "SOUNDS LIKE" query results

Quick MySQL/PHP question. I'm using a "not-so-strict" search query as a fallback if no results are found with a normal search query, to the tune of: foreach($find_array as $word) { $clauses[] = "(firstname SOUNDS LIKE '$word%' OR lastname SOUNDS…
Greg
  • 7,782
  • 7
  • 43
  • 69
4
votes
1 answer

Is there a "Sounds-Like" string matching algorithm implemented in Dutch?

I know about the Soundex and Double Metaphone algorithms for "sounds-like" string matching in English. Where can I find a similar algorithm, or a port of one of these algorithms, for the Dutch language?
Paco
  • 8,335
  • 3
  • 30
  • 41
4
votes
1 answer

How to search for Soundex() substrings in MySQL?

I have a problem with the Joomla! 3 integrated search engine. This engine's indexer creates so-called Soundex values when indexing content such as Testobject, Testobject 1, Testobject 2239923, Textobject .... which all have the same…
user1014412
  • 359
  • 1
  • 2
  • 15
3
votes
2 answers

How can I tokenize a string in MySQL?

My project is importing a sizable collection (500K+ rows) of data from flat Excel files, which are manually created by a team of people. The problem is that it all needs to be normalized for client searching. For example, the company field will…
3
votes
4 answers

Soundex Algorithm implementation using C++

Put simply, a Soundex algorithm changes a series of characters into a code. Characters that produce the same Soundex code are said to sound the same. The code is 4 characters wide. The first character of the code is always the first character of the…
user77482
3
votes
2 answers

Implementing Soundex in Java

Please help me implement string similarity comparison in Java! I am using the org.apache.commons.codec.language.Soundex library: Soundex soundex = new Soundex(); String phoneticValue = soundex.encode("YourString"); String phoneticValue2 =…
anduplats
  • 885
  • 2
  • 14
  • 24
3
votes
4 answers

SQL Server's SoundEx function on non-Latin character sets?

Does SQL Server's (2000) Soundex function work on Asian character sets? I used it in a query and it appears to have not worked properly but I realize that it could be because I don't know how to read Chinese... Furthermore, are there any other…
Frank V
  • 25,141
  • 34
  • 106
  • 144
3
votes
6 answers

Data structure for soundex algorithm?

Can anyone suggest which data structure to use for a Soundex algorithm program? The language to be used is Java. Has anybody worked on this before in Java? The program should have these features: be able to read about 50,000 words; should…
javac
  • 5,651
  • 4
  • 18
  • 6
3
votes
1 answer

Optimizing a Soundex Query for finding similar names

My application will offer a list of suggestions for English names that "sound like" a given typed name. The query will need to be optimized to return results as quickly as possible. Which option would be best for returning results quickly?…
xkingpin
  • 631
  • 8
  • 16
3
votes
2 answers

Is it possible to use Soundex (or other SQL functions) in LinqToSql?

I'm refactoring some code currently implemented in stored procedures to use LinqToSql (for use in training). Is it possible to use SQL functions in a LinqToSql query?
Danimal
  • 7,672
  • 8
  • 47
  • 57