5

I want to find out whether a string is in exponential format. Currently I am checking like this:

var s = "1.23E+04";
var hasExponential = s.Contains("E");

But I know this is not the proper way. Can anyone suggest a proper and fast way to do this?

Selvamz

3 Answers

7

If you also want to make sure it really is a number, not just a string with an 'E' in it, maybe a function like this can be helpful. The logic remains simple.

private bool IsExponentialFormat(string str)
{
    double dummy;
    return (str.Contains("E") || str.Contains("e")) && double.TryParse(str, out dummy);
}
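
One caveat with the two-argument `double.TryParse` is that it uses the current culture, so `"1.23E+04"` may be read differently on a machine whose decimal separator is a comma. Below is a minimal sketch of the same idea with the culture and number style pinned. The class name `ExponentialCheck` and the null/empty guard are illustrative assumptions, not part of the answer above.

```csharp
using System;
using System.Globalization;

public static class ExponentialCheck
{
    // Both exponent markers, allocated once and reused on every call.
    private static readonly char[] ExponentMarkers = { 'e', 'E' };

    // Hypothetical variant of IsExponentialFormat with culture and style fixed.
    public static bool IsExponentialFormat(string str)
    {
        if (string.IsNullOrEmpty(str))
            return false;

        double dummy;
        // IndexOfAny scans the string once for either marker.
        // NumberStyles.Float allows sign, decimal point and exponent, but no thousands separators.
        return str.IndexOfAny(ExponentMarkers) >= 0
            && double.TryParse(str, NumberStyles.Float, CultureInfo.InvariantCulture, out dummy);
    }
}
```

With these settings the check should give the same answer regardless of the machine's regional settings, at the cost of rejecting input written with a comma as the decimal separator.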
sstan
1

You could try a regular expression:

string s = "1.23E+04";
string pattern = @"^\d{1}.\d+(E\+)\d+$";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
bool hasExponential = rgx.IsMatch(s);
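
Note that in the pattern as posted the `.` is unescaped, so it matches any character, and `(E\+)` only accepts a positive exponent written with an explicit `+`. A more permissive sketch follows. The grammar it accepts (optional sign, one or more digits, optional fraction, `e` or `E`, optional sign, digits) is an assumption about what should count as exponential format, and the class name `ExponentialRegex` is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExponentialRegex
{
    // Assumed grammar: [sign] digits [. digits] (e|E) [sign] digits
    // [0-9] is used instead of \d so that only ASCII digits match.
    private static readonly Regex Pattern =
        new Regex(@"^[+-]?[0-9]+(\.[0-9]+)?[eE][+-]?[0-9]+$", RegexOptions.CultureInvariant);

    // Returns true when the whole string matches the assumed grammar.
    public static bool IsExponential(string s)
    {
        return s != null && Pattern.IsMatch(s);
    }
}
```

Keeping the `Regex` in a static field avoids rebuilding it on every call, which matters if the check runs in a loop.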
nevermoi
  • Performance will suffer if we use Regex. Can you suggest a more efficient way? – Selvamz Jun 12 '15 at 01:48
  • @Selvamz Performance is always affected, don't reject this because you **think** it might impact your solution, which it probably won't. – Bas Jun 12 '15 at 02:36
  • @Bas This method is twice as slow as the method by sstan, or the second method I mentioned. – Der Kommissar Jun 12 '15 at 02:50
  • @EBrown but as opposed to whatever algorithm is also running, it might be 0.001 or 0.002 of your total runtime – Bas Jun 12 '15 at 14:04
  • @Bas That's an irrelevant statement if the OP is concerned about performance. It doesn't matter *what portion* of the runtime it is, if the OP wants the fastest algorithm, this is **not** it. – Der Kommissar Jun 12 '15 at 14:05
  • Or the OP doesn't know that this part of perf doesn't matter, like 99% of OPs concerned about perf http://ericlippert.com/2012/12/17/performance-rant/ – Bas Jun 12 '15 at 14:10
0

You could always split on e. Or use double.TryParse. Both work pretty well. (I would bet the ParseExponential2 is faster for valid ones, whereas ParseExponential1 could be faster for invalid ones.)

public static void _Main(string[] args)
{
    string[] exponentials = new string[] { "1.23E+4", "1.23E+04", "1.23e+4", "1.23", "1.23e+4e+4", "abce+def", "1.23E-04" };

    for (int i = 0; i < exponentials.Length; i++)
        Console.WriteLine("Input: {0}; Result 1: {1}; Result 2: {2}; Result 3: {3}", exponentials[i], (ParseExponential1(exponentials[i]) ?? 0), (ParseExponential2(exponentials[i]) ?? 0), (ParseExponential3(exponentials[i]) ?? 0));
}

public static double? ParseExponential1(string input)
{
    if (input.Contains("e") || input.Contains("E"))
    {
        string[] inputSplit = input.Split(new char[] { 'e', 'E' });

        if (inputSplit.Length == 2) // If there were not two elements split out, it's an invalid exponential.
        {
            double left = 0;
            int right = 0;

            if (double.TryParse(inputSplit[0], out left) && int.TryParse(inputSplit[1], out right) // Parse the mantissa and the exponent.
                && right >= -324 && right <= 308) // Coarse check that the exponent fits a double (roughly 5E-324 to 1.7E+308).
            {
                double result = 0;

                if (double.TryParse(input, out result))
                    return result;
            }
        }
    }

    return null;
}

public static double? ParseExponential2(string input)
{
    if (input.Contains("e") || input.Contains("E"))
    {
        double result = 0;

        if (double.TryParse(input, out result))
            return result;
    }

    return null;
}

public static double? ParseExponential3(string input)
{
    double result = 0;

    if (double.TryParse(input, out result))
        return result;

    return null;
}

If ParseExponential1, ParseExponential2 or ParseExponential3 returns null, the input was invalid. This also lets you parse the string at the same time and get the value it represents.

The only issue with ParseExponential3 is that it will also return valid numbers if they are not exponential. (I.e., for the 1.23 case it will return 1.23.)

You could also remove the checks for the double range. They are just there for completeness.

Also, to prove a point, I just ran a benchmark with the code below, and the Regex option in the other answer by nevermoi took 500ms to run 100,000 times over the exponentials array, the ParseExponential1 option took 547ms, and the ParseExponential2 option 317ms. The method by sstan took 346ms. Lastly, the fastest was my ParseExponential3 method at 134ms.

string[] exponentials = new string[] { "1.23E+4", "1.23E+04", "1.23e+4", "1.23", "1.23e+4e+4", "abce+def", "1.23E-04" };

Stopwatch sw = new Stopwatch();
sw.Start();
for (int round = 0; round < 100000; round++)
    for (int i = 0; i < exponentials.Length; i++)
        ParseExponential1(exponentials[i]);
sw.Stop();
Console.WriteLine("Benchmark 1 (ParseExponential1) complete: {0}ms", sw.ElapsedMilliseconds);

sw.Reset();

sw.Start();
for (int round = 0; round < 100000; round++)
    for (int i = 0; i < exponentials.Length; i++)
        ParseExponential2(exponentials[i]);
sw.Stop();
Console.WriteLine("Benchmark 2 (ParseExponential2) complete: {0}ms", sw.ElapsedMilliseconds);
sw.Reset();
string pattern = @"^\d{1}.\d+(E\+)\d+$";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);

sw.Start();
for (int round = 0; round < 100000; round++)
    for (int i = 0; i < exponentials.Length; i++)
        rgx.IsMatch(exponentials[i]);
sw.Stop();
Console.WriteLine("Benchmark 3 (Regex Parse) complete: {0}ms", sw.ElapsedMilliseconds);
sw.Reset();

sw.Start();
for (int round = 0; round < 100000; round++)
    for (int i = 0; i < exponentials.Length; i++)
        IsExponentialFormat(exponentials[i]);
sw.Stop();
Console.WriteLine("Benchmark 4 (IsExponentialFormat) complete: {0}ms", sw.ElapsedMilliseconds);
sw.Reset();

sw.Start();
for (int round = 0; round < 100000; round++)
    for (int i = 0; i < exponentials.Length; i++)
        ParseExponential3(exponentials[i]);
sw.Stop();
Console.WriteLine("Benchmark 5 (ParseExponential3) complete: {0}ms", sw.ElapsedMilliseconds);

And the following method was added:

private static bool IsExponentialFormat(string str)
{
    double dummy;
    return (str.Contains("E") || str.Contains("e")) && double.TryParse(str, out dummy);
}
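
The repeated `Stopwatch` blocks in the benchmark could be folded into a single helper. The sketch below is one way to do that. The names `Bench` and `Measure` are hypothetical, and the warm-up pass is an addition that was not part of the measurements quoted above, so numbers produced with it are not directly comparable to them.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Hypothetical helper: times `rounds` passes of `action` over every input
    // and returns the elapsed milliseconds.
    public static long Measure(string label, string[] inputs, int rounds, Action<string> action)
    {
        // Warm-up pass so that JIT compilation is not counted in the timing.
        foreach (string input in inputs)
            action(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int round = 0; round < rounds; round++)
            for (int i = 0; i < inputs.Length; i++)
                action(inputs[i]);
        sw.Stop();

        Console.WriteLine("{0} complete: {1}ms", label, sw.ElapsedMilliseconds);
        return sw.ElapsedMilliseconds;
    }
}
```

A call would then look like `Bench.Measure("ParseExponential2", exponentials, 100000, s => ParseExponential2(s));`, with a fresh `Stopwatch` per measurement so that no `Reset` call can be forgotten.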
Der Kommissar
  • `e-` and `E-` are valid too. That's the trouble with being too clever to gain a microsecond. Simplicity ensures correctness, which should always be priority number one. – sstan Jun 12 '15 at 02:23
  • @sstan Ah yes, I forgot about the negatives. I'll make that change. – Der Kommissar Jun 12 '15 at 02:23
  • Plus, you are creating a bunch of new strings by splitting. Regex is much faster than that. – Bas Jun 12 '15 at 02:37
  • @Bas Have you benchmarked it? If not, then that statement is invalid. I just benchmarked it and the `Regex` option is the same speed as the string option, which is twice as slow as the non-string option. – Der Kommissar Jun 12 '15 at 02:41
  • Try with RegexOptions.Compiled – Bas Jun 12 '15 at 14:06
  • @Bas Even if you use `RegexOptions.Compiled` with `RegexOptions.IgnoreCase`, it's still just as slow. If you remove `RegexOptions.IgnoreCase` and replace `E` with `[Ee]`, it's still slower than the `double.TryParse` options. (Albeit not as slow as before, but still slower than any of the non-string/non-regex options.) You just can't squeeze the same performance out of a `Regex` like that. (Though, it *is* faster than my `string.Split` method with either of the `RegexOptions.Compiled` methods.) – Der Kommissar Jun 12 '15 at 14:10
  • @Bas I honestly don't care at this point, I have laid out the facts. Take them or leave them. I'm not going to argue with someone who is being pedantic about the situation. This isn't even *your* question and you're trying to be pedantic about it. – Der Kommissar Jun 12 '15 at 14:14