As you can see, I have set my values to "SMITH" and "SMYTHE" in my Main method. The output for these values should be 25030, but for some reason they are being encoded as 250300. I think this is because the first character of the word is handled before any encoding happens, e.g. for SMITH the first character "S" is kept as the letter "S" rather than being encoded. How do I make that S become a digit as well?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace SoundDexFinal
{
    class Program
    {
        static void Main(string[] args)
        {

            string value1 = "SMITH";
            string value2 = "Smythe";

            soundex soundex = new soundex();
            Console.WriteLine(soundex.GetSoundex(value1));      // Outputs "S50300"
            Console.WriteLine(soundex.GetSoundex(value2));      // Outputs "S50300"
            Console.WriteLine(soundex.Compare(value1, value2)); // Outputs "6"
            Console.ReadLine();
        }
    }

    class soundex
    {
        // Builds the code: first letter kept as-is, remaining letters mapped to digits.
        public string GetSoundex(string value)
        {
            value = value.ToUpper();
            StringBuilder soundex = new StringBuilder();
            foreach (char ch in value)
            {
                if (char.IsLetter(ch))
                    AddCharacter(soundex, ch);
            }
            RemovePlaceholders(soundex);
            FixLength(soundex);
            return soundex.ToString();
        }

        // Appends the first letter unchanged; afterwards appends the digit for ch
        // unless it equals the last character already in the buffer.
        private void AddCharacter(StringBuilder soundex, char ch)
        {
            if (soundex.Length == 0)
                soundex.Append(ch);
            else
            {
                string code = GetSoundexDigit(ch);
                if (code != soundex[soundex.Length - 1].ToString())
                    soundex.Append(code);
            }
        }

        // Maps an upper-case letter to its digit; "." marks anything unmapped.
        private string GetSoundexDigit(char ch)
        {
            string chString = ch.ToString();

            if ("AEIOUHWY".Contains(chString))
                return "0";
            else if ("BFPV".Contains(chString))
                return "1";
            else if ("CGJKQSXZ".Contains(chString))
                return "2";
            else if ("DT".Contains(chString))
                return "3";
            else if (ch == 'L')
                return "4";
            else if ("MN".Contains(chString))
                return "5";
            else if (ch == 'R')
                return "6";
            else
                return ".";
        }

        private void RemovePlaceholders(StringBuilder soundex)
        {
            soundex.Replace(".", "");
        }

        // Pads with '0' or truncates so the result is always six characters long.
        private void FixLength(StringBuilder soundex)
        {
            int length = soundex.Length;
            if (length < 6)
                soundex.Append(new string('0', 6 - length));
            else
                soundex.Length = 6;
        }

        // Counts the positions (0 to 6) at which the two codes agree.
        public int Compare(string value1, string value2)
        {
            int matches = 0;
            string soundex1 = GetSoundex(value1);
            string soundex2 = GetSoundex(value2);

            for (int i = 0; i < 6; i++)
                if (soundex1[i] == soundex2[i]) matches++;

            return matches;
        }
    }
}
Ňɏssa Pøngjǣrdenlarp
Minal Modi
  • What Soundex variant is that? You normally *drop* {AEIOUHWY} rather than assign `0` and retain the first letter. Smith would be "S530" – Ňɏssa Pøngjǣrdenlarp Nov 04 '15 at 01:42
  • They both output S50300 for me: https://dotnetfiddle.net/q4eGYF. I've built a soundex function you can find [here](http://stackoverflow.com/questions/11121936/dotnet-soundex-function/32752520#32752520) if that helps. – Daniel Flint Nov 04 '15 at 01:43
  • I have checked thoroughly and I am sure it should be 25030 or 2530. I don't want it to keep the first character of the name as a letter; I want it to use that character's digit instead. Any ideas? – Minal Modi Nov 04 '15 at 01:49
  • What do you mean when you say you don't want to check the first character but want its value instead? – Jigneshk Nov 04 '15 at 01:51
  • For example, it is encoding as S5030 or similar. I don't want it to keep the "S"; I want it to encode the "S" using the digit from my if statements, so "SMITH" should be 25030, with a "2" as the first character instead of the "S". – Minal Modi Nov 04 '15 at 01:53
  • It sounds like you should then change the `AddCharacter` method to only contain the code within the `else` branch. But I don't think you can really call this "soundex" any more :) – Daniel Flint Nov 04 '15 at 02:00
  • I have tried to do that, but it does not seem to be working. – Minal Modi Nov 04 '15 at 02:02
  • It seems to work for me: https://dotnetfiddle.net/o1bNUK (edit: with an additional check for the length of the string being 0, of course) – Daniel Flint Nov 04 '15 at 02:05
  • Okay, thanks, it's working now. Any ideas how I would change those if statements into a switch with cases? – Minal Modi Nov 04 '15 at 02:09
  • My implementation I linked to in my first comment uses `switch`... – Daniel Flint Nov 04 '15 at 02:11
  • I have tried using cases similar to your implementation, but I get an error saying it cannot convert char to string, even though I have tried calling `.ToString()` before the switch statement or before a switch case. – Minal Modi Nov 04 '15 at 02:21

2 Answers

You are calling the FixLength function, which appends extra '0' characters to the end of the string if its length is less than 6.

That's the reason you are getting "250300" instead of "25030".
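
To see the padding in isolation, here is a minimal sketch of the same length-fixing logic with the target length passed in as a parameter. The `LengthDemo` class, the `Pad` helper, and its `targetLength` parameter are hypothetical names for illustration, not part of the question's code. With a target of 6, a five-character code such as "25030" gains one trailing zero; with a target of 5 it is left unchanged.

```csharp
using System;
using System.Text;

static class LengthDemo
{
    // Hypothetical helper mirroring FixLength from the question,
    // but with the target length passed in instead of hard-coded as 6.
    public static string Pad(string code, int targetLength)
    {
        var sb = new StringBuilder(code);
        if (sb.Length < targetLength)
            sb.Append('0', targetLength - sb.Length); // pad with trailing zeros
        else
            sb.Length = targetLength;                 // truncate to the target
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Pad("25030", 6)); // 250300
        Console.WriteLine(Pad("25030", 5)); // 25030
    }
}
```

So if a five-digit result is what you want, changing the two occurrences of 6 in FixLength (and the loop bound in Compare) to 5 should be enough.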

Jigneshk

Per the discussion, changing the AddCharacter method like so will achieve what you're after:

private void AddCharacter(StringBuilder soundex, char ch)
{
    string code = GetSoundexDigit(ch);
    if (soundex.Length == 0 || code != soundex[soundex.Length - 1].ToString())
        soundex.Append(code);
}

But I wouldn't refer to this as "Soundex" anymore, since it no longer is.
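
Putting the pieces from the discussion together, a minimal sketch of the digit-only variant might look like the following. The class name `DigitCode`, the `Encode` and `Digit` methods, and the `length` parameter (defaulting to 5 so that "SMITH" yields 25030 rather than 250300) are assumptions for illustration, not code from the question; the letter-to-digit mapping is the one from `GetSoundexDigit`. It uses a `switch` on the `char` itself with `char` case labels, so no conversion to `string` is needed. One simplification to be aware of: letters outside A-Z are skipped outright instead of going through the "." placeholder.

```csharp
using System;
using System.Text;

static class DigitCode
{
    // Same mapping as GetSoundexDigit in the question, written as a switch on char.
    // Returns '\0' for characters outside A-Z so the caller can skip them.
    static char Digit(char ch)
    {
        switch (ch)
        {
            case 'A': case 'E': case 'I': case 'O':
            case 'U': case 'H': case 'W': case 'Y':
                return '0';
            case 'B': case 'F': case 'P': case 'V':
                return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z':
                return '2';
            case 'D': case 'T':
                return '3';
            case 'L':
                return '4';
            case 'M': case 'N':
                return '5';
            case 'R':
                return '6';
            default:
                return '\0';
        }
    }

    // Encodes every letter (including the first) as a digit, collapses
    // adjacent repeats, then pads or truncates to the requested length.
    public static string Encode(string value, int length = 5)
    {
        var sb = new StringBuilder();
        foreach (char ch in value.ToUpperInvariant())
        {
            char code = Digit(ch);
            if (code == '\0') continue;                      // not A-Z: skip
            if (sb.Length == 0 || sb[sb.Length - 1] != code) // drop adjacent repeats
                sb.Append(code);
        }
        if (sb.Length < length) sb.Append('0', length - sb.Length);
        else sb.Length = length;
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Encode("SMITH"));  // 25030
        Console.WriteLine(Encode("Smythe")); // 25030
    }
}
```

Calling `Encode("SMITH", 6)` would reproduce the six-character "250300" that the question's FixLength produces.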

Daniel Flint