I do a lot of file parsing, which involves parsing data types like decimal. To help make the code more readable, I've been using the following method:

public static decimal? ToDecimal(this string data)
{
    decimal result;
    if (decimal.TryParse(data, NumberStyles.Integer | NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out result))
        return result;

    return null;
}

This works fine for decimals that use a full stop (period) as the decimal separator. However, I'd like this function to work with other standard decimal separators, particularly the comma. (I read that there is also an Arabic decimal separator: http://en.wikipedia.org/wiki/Decimal_mark#Other_numeral_systems, but that presumably relies on being able to parse the Eastern Arabic numerals too.)

CultureInfo.CurrentCulture wouldn't be appropriate because the data is not necessarily created on the machine that is doing the processing. So I've now got this:

private static CultureInfo CreateCultureWithNumericDecimalSeparator(string separator)
{
    var cultureInfo = (CultureInfo)CultureInfo.InvariantCulture.Clone();
    cultureInfo.NumberFormat.NumberDecimalSeparator = separator;
    return cultureInfo;
}

private static CultureInfo[] cultureInfos = new CultureInfo[]
{
    CultureInfo.InvariantCulture,
    CreateCultureWithNumericDecimalSeparator(",") // Normal comma
};

public static decimal? ToDecimal(this string data)
{
    foreach (CultureInfo cultureInfo in cultureInfos)
    {
        decimal result;
        if (decimal.TryParse(data, NumberStyles.Integer | NumberStyles.AllowDecimalPoint, cultureInfo, out result))
            return result;
    }

    return null;
}

This works, but parsing twice seems a little heavy, especially given that TryParse checks all sorts of settings (thousands separators, hex specifiers, currency symbols, exponents, etc.). It's probably not a performance issue, but I'm curious to know whether there's a more efficient way to do this, or possibly even an existing method within the framework. And maybe even a method that could cope with other numeral systems in modern use? Thanks.
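One possible way to avoid the second parse, sketched here under the assumption stated in the comments below (the files never contain thousands separators), is to normalise the separator first and call TryParse once. The class name `DecimalParsing` and the method name `ToDecimalAnySeparator` are made up for illustration; this is not a framework method.

```csharp
using System;
using System.Globalization;

public static class DecimalParsing
{
    // Hypothetical helper: treats ',' and '.' alike as the decimal separator
    // by normalising the text before a single TryParse call.
    // Assumption: the input never contains thousands separators.
    public static decimal? ToDecimalAnySeparator(this string data)
    {
        if (data == null)
            return null;

        // Map the comma onto the invariant culture's decimal separator.
        string normalised = data.Replace(',', '.');

        decimal result;
        if (decimal.TryParse(normalised,
                             NumberStyles.Integer | NumberStyles.AllowDecimalPoint,
                             CultureInfo.InvariantCulture,
                             out result))
            return result;

        // Anything with more than one separator (e.g. "1.2,3") fails here.
        return null;
    }
}
```

The trade-off is an extra string allocation per value in exchange for a single parse, so whether it is actually faster than the two-culture loop would need measuring.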

Giles
  • Seems pretty efficient to me. The one change I'd make is placing InvariantCulture at the end of the array so that your more specific cases get evaluated first. You aren't parsing twice; you are only parsing as many times as it takes to get a match. Could be once, could be every culture in the array. Whatever it takes. Good job. – Sam Axe Aug 09 '12 at 18:39
  • Without knowing the culture it might be tricky. Note that 1,100 in the PL culture is one and one tenth, whereas 1,100 in EN is one thousand one hundred. So you cannot read your user's mind and get it right. – Rafal Aug 09 '12 at 18:39
  • Thank you both for the feedback. I'm not too worried about confusing the thousand separators because it's very unusual to see those used in the B2B file formats I'm dealing with. The formatting of decimal has typically been done by machine rather than entered by hand. – Giles Aug 09 '12 at 19:03
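
The ambiguity raised in the comments above can be reproduced with a small sketch. To avoid depending on the culture data installed on a given machine, it uses clones of the invariant culture with the separators set by hand rather than the real pl-PL or en-US cultures; the class and method names are illustrative only.

```csharp
using System;
using System.Globalization;

public static class SeparatorAmbiguity
{
    // Parses text using a clone of the invariant culture with the given
    // decimal and group separators. A stand-in for real cultures, whose
    // separator data can vary by platform.
    public static decimal ParseWith(string text, string decimalSeparator, string groupSeparator)
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = decimalSeparator;
        culture.NumberFormat.NumberGroupSeparator = groupSeparator;

        // NumberStyles.Number includes AllowThousands, unlike the styles
        // used in the question, so group separators are accepted here.
        return decimal.Parse(text, NumberStyles.Number, culture);
    }
}
```

With this helper, `ParseWith("1,100", ".", ",")` gives 1100 and `ParseWith("1,100", ",", ".")` gives 1.100: the same text, two different values. The question's code sidesteps this only because it omits `NumberStyles.AllowThousands`.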

1 Answer

It seems that the framework doesn't provide this directly. Interestingly, answers to another question suggest that the framework doesn't provide any mechanism for parsing Eastern Arabic numerals, even with the culture set. As the Eastern Arabic numerals requirement was just theoretical for my app, I'm sticking with the code I've already got above. If I implement anything more specific, I'll post it!

Giles
  • Related to CultureInfo.CurrentCulture not being appropriate for some apps, I've just noticed a new feature in .NET 4.5: CultureInfo.DefaultThreadCurrentCulture. Just thought I'd mention it, in case it's useful to anyone who comes across this while doing similar work. – Giles Aug 20 '12 at 15:48
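
A minimal sketch of how that property might be used, assuming an app that wants invariant formatting on worker threads regardless of the machine's regional settings. The class and method names are illustrative, and the default only applies to threads that have not had a culture assigned or inherited explicitly.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class DefaultCultureDemo
{
    // Sets the app-domain-wide default culture, then formats a value on a
    // freshly created thread to show that the thread picks it up.
    // Note: this changes global state, so a real app would normally do it
    // once at startup rather than inside a helper like this.
    public static string FormatOnNewThread(decimal value)
    {
        CultureInfo.DefaultThreadCurrentCulture = CultureInfo.InvariantCulture;

        string formatted = null;
        var worker = new Thread(() => formatted = value.ToString());
        worker.Start();
        worker.Join();
        return formatted;
    }
}
```

For parsing file data, passing the culture explicitly to TryParse, as in the question, remains the more predictable choice, since it does not depend on any thread-level state.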