Can the C# int data type hold culture-specific numbers such as Eastern Arabic numerals? E.g. "123" would be written as

١٢٣

I’m working with SoapUI to send requests and receive responses. The web service is written in C#.

However, when I enter these Eastern Arabic numerals in SoapUI, it says

“The value cannot be parsed”.

It’s not clear whether this is a SoapUI issue or a C# issue.

Can anyone help?

Appreciate your answers!

Dmitry Bychenko
  • 1, 2 ... are in fact not English but Arabic numerals. Or do you literally want to input the Arabic translation of `one` as a word? That won't work for English, Arabic, or any other language – derpirscher Oct 18 '20 at 13:12
  • You need to look up the difference between the abstract integers that are stored in an int variable and the process of parsing a string to an integer. – Tobias Kildetoft Oct 18 '20 at 13:14
  • @derpirscher hmm then are they called Persian numbers? I meant these “١ ٢ ٣" – programmingNoob Oct 18 '20 at 13:22
  • @Tobias Kildetoft I know the difference between them. Which part of my question suggested that I am trying to parse a string to an int? – programmingNoob Oct 18 '20 at 13:26
  • @programmingNoob The fact that “١ ٢ ٣" are characters that represent numbers. C# and most other languages standardize on Arabic digits to represent numbers. So you can do `int x = 1;` but not `int x = ٣;` – juharr Oct 18 '20 at 13:31

2 Answers

You can use char.GetNumericValue to convert culture-specific digits (e.g. Persian or Eastern Arabic) into the common 0..9 before parsing:

// Requires: using System.Text; (for StringBuilder)
private static bool TryParseAnyCulture(string value, out int result) {
  result = default(int);

  if (null == value)
    return false;

  StringBuilder sb = new StringBuilder(value.Length);

  foreach (char c in value) {
    double d = char.GetNumericValue(c);

    // d < 0      : character is not a digit, like '-'
    // d % 1 != 0 : character represents some fraction, like 1/2
    if (d < 0 || d % 1 != 0)
      sb.Append(c);
    else
      sb.Append((int)d);
  }

  return int.TryParse(sb.ToString(), out result);
}

Demo:

string value = "١٢٣"; // Eastern Arabic Numerals (0..9 are Western)

Console.Write(TryParseAnyCulture(value, out var result) ? $"{result}" : "???");

Outcome:

123
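
A stricter variant of the same idea, as a minimal sketch: char.GetNumericValue also returns whole numbers for some characters that are not decimal digits (superscripts such as '²', for instance), so this version converts only characters for which char.IsDigit is true (Unicode category Nd) and parses with the invariant culture. The names DigitNormalizer and TryParseDecimalDigits are made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DigitNormalizer
{
    // Hypothetical helper: maps every Unicode decimal digit (category Nd)
    // to its ASCII counterpart and leaves all other characters untouched.
    public static bool TryParseDecimalDigits(string value, out int result)
    {
        result = 0;
        if (value == null)
            return false;

        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (char.IsDigit(c))
                sb.Append((char)('0' + (int)char.GetNumericValue(c)));
            else
                sb.Append(c); // signs, letters, superscripts, etc. pass through
        }

        // Parse the normalized text without depending on the current culture
        return int.TryParse(sb.ToString(), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out result);
    }
}
```

With this sketch, both "١٢٣" (Eastern Arabic) and "۱۲۳" (Persian) should parse to 123, while a string such as "1²" is rejected instead of being read as 12.
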
Dmitry Bychenko

The int type (like every other numeric type) just stores a value; it neither knows nor cares what format the original string was in. The string representation only matters for input and output.
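
To make that distinction concrete, here is a small sketch in which one stored value is rendered as three different strings. The class and method names (RepresentationDemo, Representations) are invented for this example.

```csharp
using System;
using System.Globalization;

public static class RepresentationDemo
{
    // Hypothetical helper: returns several textual forms of the same int.
    public static string[] Representations(int n)
    {
        return new[]
        {
            n.ToString(CultureInfo.InvariantCulture), // decimal
            n.ToString("X"),                          // hexadecimal
            Convert.ToString(n, 2)                    // binary
        };
    }
}
```

Representations(123) yields "123", "7B" and "1111011": three strings, one underlying value. Digits written in another script are just one more representation that has to be converted on the way in.
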

C# supports locales for internationalization through System.Globalization.CultureInfo. Each CultureInfo has a NumberFormatInfo whose NativeDigits property stores the native digits of that locale, and whose DigitSubstitution property describes how those digits should replace Latin ones. Unfortunately, according to the .NET documentation neither property affects the built-in parsing or formatting operations, which use only the Basic Latin digits 0-9. So int.Parse can't read numbers written in native digits, even when you pass the matching culture (Persian in this example), and you have to translate the digits yourself. Here's a solution that works for any culture whose digits are listed in NativeDigits:

using System;
using System.Globalization;

public class Program
{
    public static string GetWesternRepresentation(string input, CultureInfo cultureInfo)
    {
        var format = cultureInfo.NumberFormat;
        // Map separators to placeholders first, so cultures that use "," as the
        // decimal separator and "." as the group separator (e.g. de-DE) survive
        string result = input.Replace(format.NumberDecimalSeparator, "\u0001")
                             .Replace(format.NumberGroupSeparator, "\u0002")
                             .Replace(format.NegativeSign, "-")
                             .Replace(format.PositiveSign, "+");
        // NativeDigits always holds ten strings, for the digits 0 through 9
        for (int i = 0; i < format.NativeDigits.Length; i++)
            result = result.Replace(format.NativeDigits[i], i.ToString(CultureInfo.InvariantCulture));
        return result.Replace("\u0001", ".").Replace("\u0002", ",");
    }

    public static void Main()
    {
        try
        {
            var culture = new CultureInfo("fa"); // or fa-IR for Iranian Persian
            string input = "۱۲۳";
            // string input = "١٢٣";    // won't work although looks almost the same
            string output = GetWesternRepresentation(input, culture);
            Console.WriteLine("{0} -> {1}", input, output);
            int number = Int32.Parse(output, CultureInfo.InvariantCulture);
            Console.WriteLine("Value: {0}", number);
        }
        catch (FormatException)
        {
            Console.WriteLine("Bad Format");
        }
        catch (OverflowException)
        {
            Console.WriteLine("Overflow");
        }
    }
}

You can try this on .NET Fiddle

Note that if you change the input to the commented-out line, it won't work even though the strings look almost the same. That's because the digits in your question are Eastern Arabic digits (٠١٢٣٤٥٦٧٨٩ - code points U+0660-U+0669) and not Persian digits (۰۱۲۳۴۵۶۷۸۹ - code points U+06F0-U+06F9)
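
If you're unsure which digit set a string contains, a quick way to check is to print each character's code point next to its numeric value. This is only a sketch, and DigitInspector and Describe are hypothetical names.

```csharp
using System;
using System.Linq;

public static class DigitInspector
{
    // Hypothetical helper: lists each char as "U+XXXX=value",
    // where value is -1 for characters that are not numeric.
    public static string Describe(string text)
    {
        return string.Join(" ", text.Select(c =>
            $"U+{(int)c:X4}={(int)char.GetNumericValue(c)}"));
    }
}
```

Describe("١٢٣") gives "U+0661=1 U+0662=2 U+0663=3" while Describe("۱۲۳") gives "U+06F1=1 U+06F2=2 U+06F3=3", so the two strings carry the same numeric values on different code points. To handle the Eastern Arabic input with GetWesternRepresentation, you would need a culture whose NativeDigits are the U+0660-U+0669 set; an Arabic culture such as "ar-SA" is a likely candidate, but which digits a culture reports depends on the runtime's globalization data, so check NativeDigits first.
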

phuclv