I am trying to create a regular expression to remove formatting of financial values received in a string.

I have written code to remove the currency symbol, but I am having problems removing the thousands separator, as it can be any one of the following: `,` `.` `'`

This is what I have so far:

    string pattern = @"\p{Sc}*(\s?\d+[.,]?\d*)\p{Sc}*";
    string replacement = "$1";
    string input = "here are the text values: $16,000.32 12.19 £16.29 €18.000,29  €18,29 ₹17,00,00,00,000.00";
    string result = Regex.Replace(input, pattern, replacement);
    Console.WriteLine(result);

How can I modify my code to also replace the 1000's separator and standardise the decimal notation?

Michał Turczyn
  • [NumberStyles](https://learn.microsoft.com/en-us/dotnet/api/system.globalization.numberstyles?view=netframework-4.7.2) is your friend – Liam Jan 21 '19 at 10:36
  • "I had a problem and then I used regular expressions. Now I have two problems" - Regular expressions aren't a good fit for this problem. – Dan Jan 21 '19 at 10:42
  • The duplicate won't answer this question directly. But yes, he shouldn't use regex to parse decimals. Instead you should take the long route: split the values by space, then use `decimal.TryParse(v, NumberStyles.AllowCurrencySymbol | NumberStyles.Any, c, out decimal d)` to parse them. You can store all allowed `CultureInfo`s in a list/array and use a loop/LINQ to pass them to `decimal.TryParse`. If it returned `true` you have the decimal value. Then use `decimal.ToString(format)` to get your desired format. You build the result string with `String.Join` – Tim Schmelter Jan 21 '19 at 11:25
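
The approach described in the last comment could look roughly like the sketch below. The class name `CurrencyNormalizer`, the list of candidate cultures and the `"0.00"` output format are assumptions made for illustration; they are not from the question. The first culture that parses a token wins, so the order of the list decides how ambiguous tokens such as `12.19` are read.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class CurrencyNormalizer
{
    // Assumed candidate cultures (hypothetical choice). Order matters:
    // the first culture that successfully parses a token is used.
    private static readonly CultureInfo[] Cultures =
    {
        CultureInfo.GetCultureInfo("en-US"),
        CultureInfo.GetCultureInfo("en-GB"),
        CultureInfo.GetCultureInfo("de-DE"),
    };

    // Splits the input on spaces, normalizes every token that parses
    // as a number, and joins the tokens back with single spaces.
    public static string Normalize(string input)
    {
        var tokens = input
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(NormalizeToken);
        return string.Join(" ", tokens);
    }

    private static string NormalizeToken(string token)
    {
        foreach (CultureInfo culture in Cultures)
        {
            // NumberStyles.Any already includes AllowCurrencySymbol.
            if (decimal.TryParse(token, NumberStyles.Any, culture, out decimal value))
            {
                // Invariant culture: '.' as decimal separator, no grouping.
                return value.ToString("0.00", CultureInfo.InvariantCulture);
            }
        }
        return token; // not a number in any candidate culture: keep as-is
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("values: $16,000.32 12.19 £16.29 €18.000,29"));
    }
}
```

Note that the parser is lenient about where group separators sit, so a token like `12.19` would be read as 1219 if `de-DE` came first in the list; that is why `en-US` is placed first in this sketch.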

1 Answer

You could use a pattern such as `(\d)[.,'](\d{3})`.

Explanation: it will match any of the thousands separators you listed, via `[.,']`, if the separator is preceded by a digit and followed by three digits.

The preceding and following digits are captured into the first and second capturing groups, so you just need to replace the match with `$1$2` in .NET (written `\1\2` in some other regex flavours), meaning you omit the separator.

Note: it would also remove the decimal separator if the decimal part has three or more digits.
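
As a rough illustration, here is a variation on the pattern above that uses lookarounds instead of capturing groups. This is a sketch rather than the exact pattern from this answer: the class and method names are made up for the example. Because the lookbehind and lookahead do not consume the digits, consecutive groups such as `1'000'000` are handled in a single pass, and the added `(?!\d)` requires the group to be exactly three digits long.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorDemo
{
    // (?<=\d)        a digit must precede the separator (not consumed)
    // [.,']          one of the candidate thousands separators
    // (?=\d{3}(?!\d)) exactly three digits must follow (not consumed)
    private static readonly Regex ThousandsSeparator =
        new Regex(@"(?<=\d)[.,'](?=\d{3}(?!\d))");

    // Removes every separator matched by the pattern above.
    public static string StripThousandsSeparators(string input)
    {
        return ThousandsSeparator.Replace(input, string.Empty);
    }

    public static void Main()
    {
        string input = "$16,000.32 12.19 €18.000,29 1'000'000";
        // prints: $16000.32 12.19 €18000,29 1000000
        Console.WriteLine(StripThousandsSeparators(input));
    }
}
```

The same caveat still applies: a decimal part of exactly three digits (for example `1.234`) is indistinguishable from a thousands group and would lose its separator. Values that use two-digit grouping, such as `₹17,00,00,00,000.00`, would only have the last separator removed, and the decimal separator itself is not standardised by this step.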

Michał Turczyn