19

The .NET Framework gives us the Format method:

string s = string.Format("This {0} very {1}.", "is", "funny");
// s is now: "This is very funny."

I would like an "Unformat" function, something like:

object[] values = string.Unformat("This {0} very {1}.", "This is very funny.");
// values is now: ["is", "funny"]

I know something similar exists in the ANSI-C library (printf vs scanf).

The question: is there something similar in C#?

Update: Capturing groups with regular expressions are not the solution I need. They are also one way. I'm looking for a system that can work both ways in a single format. It's OK to give up some functionality (like types and formatting info).

doekman

8 Answers

16

There's no such method, probably because of problems resolving ambiguities:

string.Unformat("This {0} very {1}.", "This is very very funny.")
// are the parameters equal to "is" and "very funny", or "is very" and "funny"?

Regular expression capturing groups are made for this problem; you may want to look into them.
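
To make the ambiguity concrete, here is a minimal sketch (the class and method names are made up for illustration). The same input yields either split depending on whether the first capturing group is greedy or lazy, so an "Unformat" would have to pick one rule arbitrarily.

```csharp
using System.Text.RegularExpressions;

public static class AmbiguityDemo
{
    // Hypothetical helper: returns the two captures joined with '|',
    // or null when the pattern does not match the input.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value + "|" + m.Groups[2].Value : null;
    }
}
```

For the input "This is very very funny.", Capture(@"^This (.*) very (.*)\.$", input) gives "is very|funny" (greedy first group), while Capture(@"^This (.*?) very (.*)\.$", input) gives "is|very funny" (lazy first group).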

mqp
5

Regex with grouping?

/^This (.*?) very (.*?)\.$/
annakata
5

If anyone's interested, I've just posted a scanf() replacement for .NET. If regular expressions don't quite cut it for you, my code follows the scanf() format string quite closely.

You can see and download the code I wrote at http://www.blackbeltcoder.com/Articles/strings/a-sscanf-replacement-for-net.

Jonathan Wood
  • Your class does not seem to work with floating point numbers. Input = "von 95.50 bis 120.00 EUR" with format "von %f bis %f %s" should return {95.50, 120.00, "EUR"} but returns {9550, 12000, "EUR"}. – Krisztián Balla Jan 21 '14 at 13:20
  • Ok, I found the problem. Your call to double.TryParse should be: double.TryParse(input.Extract(start, input.Position), System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture.NumberFormat, out result) otherwise it will only work for people using a dot as decimal separator in their locale settings. – Krisztián Balla Jan 21 '14 at 13:34
  • Thanks, I'll take a look at that. Sorry for missing your earlier comment. – Jonathan Wood Jan 21 '14 at 17:09
  • @JonathanWood - I am using your ScanFormatted class, but how exactly are you supposed to read a number with a leading zero? My string is "20 098 20 46 22.000". I use a format string of "%2d %3d %2d %2d %f" but it returns "20,0,98,20,46" instead of "20,98,20,46,22". – Security Hound Sep 23 '21 at 06:00
  • @SecurityHound: It's been 10 years since I've looked at that but it appears the issue might be because it's interpreting a number that starts with 0 as octal (which would terminate at an '8' or '9' digit). You could pull that part to get it working. Just comment out the first `else` clause in `ParseDecimal()`. – Jonathan Wood Sep 23 '21 at 14:57
  • @JonathanWood - I can work around it by reading in a string and then converting that to an integer, but I will take a look at that. It took me 2 hours to realize that it was the .NET parse handling 22.000 as 22. I appreciate the class and your response :-) – Security Hound Sep 23 '21 at 15:02
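
The locale issue raised in the comments above can be shown in isolation. This is only a sketch: the class and method names are hypothetical and it is not the code from the linked article. It uses the double.TryParse overload that takes NumberStyles and an IFormatProvider, so that "95.50" parses the same way regardless of the current culture.

```csharp
using System.Globalization;

public static class InvariantParsing
{
    // Hypothetical wrapper: parses text with '.' as the decimal separator,
    // independent of the current thread culture. NumberStyles.Float does not
    // allow thousands separators, so "95.50" cannot be read as 9550.
    public static bool TryParseInvariant(string text, out double result)
    {
        return double.TryParse(text, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out result);
    }
}
```

The parameterless-culture overload double.TryParse(string, out double) uses the current culture instead, which is why a culture that treats '.' as a group separator can turn "95.50" into 9550.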
4

You could do string[] parts = s.Split(' '), and then extract by index position: parts[1] and parts[3] in your example.

endian
3

Yep. These are called "regular expressions". The one that does the job is:

This (?<M0>.+) very (?<M1>.+)\.
Anton Gogolev
  • I think the whole point is to circumvent the overly complex and cryptic syntax of RegEx and provide something lightweight and simple for the general cases where you don't need the complexity. Thus the format of string.Format is very desirable and self-describing for the general cases where you would want to pattern match. – Marchy Dec 16 '09 at 20:24
1

@mquander: Actually, PHP solves it differently:

$s = "This is very very funny.";
$fmt = "This %s very %s.";
sscanf($s, $fmt, $one, $two);
echo "<div>one: [$one], two: [$two]</div>\n";
// echoes: "one: [is], two: [very]"

But maybe your regular expression remark can help me. I just need to rewrite "This {0} very {1}." to something like: new Regex(@"^This (.*) very (.*)\.$"). This should be done programmatically, so I can use one format string on the public class interface.

BTW: I already have a parser to find the parameters: see the Named Format Redux blog entry by Phil Haack (and yes, I also want named parameters to work both ways).
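
The rewrite from a format string to an anchored regular expression described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the Unformatter class, the p0/p1 group names, and the choice of greedy captures are not from any existing library. Format specifiers such as {0:yyyy-MM-dd} are matched but ignored, and escaped braces ({{ and }}) are not handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class Unformatter
{
    // Matches {0}, {1,10} or {1:yyyy-MM-dd}; alignment and format spec are ignored.
    private static readonly Regex Placeholder =
        new Regex(@"\{(\d+)(?:,-?\d+)?(?::[^}]*)?\}");

    // Turns "This {0} very {1}." into ^This\ (?<p0>.*)\ very\ (?<p1>.*)\.$
    public static Regex ToRegex(string format)
    {
        var pattern = new StringBuilder("^");
        var seen = new HashSet<string>();
        int last = 0;
        foreach (Match m in Placeholder.Matches(format))
        {
            // Literal text between placeholders must match exactly, so escape it.
            pattern.Append(Regex.Escape(format.Substring(last, m.Index - last)));
            string name = "p" + m.Groups[1].Value;
            // A repeated index becomes a backreference to its first capture.
            pattern.Append(seen.Add(name) ? "(?<" + name + ">.*)" : @"\k<" + name + ">");
            last = m.Index + m.Length;
        }
        pattern.Append(Regex.Escape(format.Substring(last))).Append('$');
        return new Regex(pattern.ToString(), RegexOptions.Singleline);
    }

    // Returns the captured values ordered by placeholder index,
    // or null when the input does not match the format.
    public static string[] Unformat(string format, string input)
    {
        Match match = ToRegex(format).Match(input);
        if (!match.Success) return null;

        int count = Placeholder.Matches(format).Cast<Match>()
            .Select(m => int.Parse(m.Groups[1].Value))
            .DefaultIfEmpty(-1).Max() + 1;
        var values = new string[count];
        for (int i = 0; i < count; i++)
        {
            Group g = match.Groups["p" + i];
            values[i] = g.Success ? g.Value : null; // null for an index the format never uses
        }
        return values;
    }
}
```

With this sketch, Unformatter.Unformat("This {0} very {1}.", "This is very funny.") returns ["is", "funny"]. Because the captures are greedy, the ambiguous input "This is very very funny." resolves to ["is very", "funny"]; switching to lazy captures would give the other split.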

doekman
  • PHP behaves the same way as standard C sscanf, in this example. Sscanf does not read whitespace into a `%s` variable. – Sjoerd Oct 01 '12 at 11:58
1

I came across the same problem. I believe there is an elegant solution using regular expressions, but I came up with a C# function to "UnFormat" that works quite well. Sorry about the lack of comments.

    /// <summary>
    /// Unformats a string using the original formatting string.
    /// 
    /// Tested Situations:
    ///    UnFormat("<nobr alt=\"1\">1<nobr>", "<nobr alt=\"{0}\">{0}<nobr>") : "1"
    ///    UnFormat("<b>2</b>", "<b>{0}</b>") : "2"
    ///    UnFormat("3<br/>", "{0}<br/>") : "3"
    ///    UnFormat("<br/>4", "<br/>{0}") : "4"
    ///    UnFormat("5", "") : "5"
    ///    UnFormat("<nobr>6<nobr>", "<nobr>{0}<nobr>") : "6"
    ///    UnFormat("<nobr>2009-10-02<nobr>", "<nobr>{0:yyyy-MM-dd}<nobr>") : "2009-10-02"
    ///    UnFormat("<nobr><nobr>", "<nobr>{0}<nobr>") : ""
    ///    UnFormat("bla", "<nobr>{0}<nobr>") : "bla"
    /// </summary>
    /// <param name="original"></param>
    /// <param name="formatString"></param>
    /// <returns>A dictionary of parameter index to extracted value. If an "unformat" is not possible, the original string is returned under key 0.</returns>
    private Dictionary<int,string> UnFormat(string original, string formatString)
    {
       Dictionary<int, string> returnList = new Dictionary<int, string>();

       try{
          int index = -1;

          // Decomposes Format String
          List<string> formatDecomposed = new List<string> (formatString.Split('{'));
          for(int i = formatDecomposed.Count - 1; i >= 0; i--)
          {
             index = formatDecomposed[i].IndexOf('}') + 1;

             if (index > 0 && (formatDecomposed[i].Length - index) > 0)
             {
                formatDecomposed.Insert(i + 1, formatDecomposed[i].Substring(index, formatDecomposed[i].Length - index));
                formatDecomposed[i] = formatDecomposed[i].Substring(0, index);
             }
             else
                //Finished
                break;
          }

          // Finds and indexes format parameters
          index = 0;
          for (int i = 0; i < formatDecomposed.Count; i++)
          {
             if (formatDecomposed[i].IndexOf('}') < 0)
             {
                index += formatDecomposed[i].Length;
             }
             else
             {
                // Parameter Index
                int parameterIndex;
                if (formatDecomposed[i].IndexOf(':')< 0)
                   parameterIndex = Convert.ToInt16(formatDecomposed[i].Substring(0, formatDecomposed[i].IndexOf('}')));
                else
                   parameterIndex = Convert.ToInt16(formatDecomposed[i].Substring(0, formatDecomposed[i].IndexOf(':')));

                // Parameter Value
                if (returnList.ContainsKey(parameterIndex) == false)
                {
                   string parameterValue;

                   if (formatDecomposed.Count > i + 1)
                   {
                      if (original.Length > index)
                         parameterValue = original.Substring(index, original.IndexOf(formatDecomposed[i + 1], index) - index);
                      else
                         // Original String not valid
                         break;
                   }
                   else
                   {
                      parameterValue = original.Substring(index, original.Length - index);
                   }

                   returnList.Add(parameterIndex, parameterValue);
                   index += parameterValue.Length;
                }
                else
                {
                   index += returnList[parameterIndex].Length;
                }
             }
          }

          // Fail Safe #1
          if (returnList.Count == 0) returnList.Add(0, original);
       } 
       catch
       {
          // Fail Safe #2
          returnList = new Dictionary<int, string>();
          returnList.Add(0, original);
       }

       return returnList;
    }
Nuno Rodrigues
-1

Building on an earlier reply, I wrote the following sample:

// requires: using System.Text.RegularExpressions;
string sampleinput = "FirstWord.22222";

Match match = Regex.Match(sampleinput, @"(\w+)\.(\d+)$", RegexOptions.IgnoreCase);

if (match.Success)
{
    string totalmatchstring = match.Groups[0].Value; // FirstWord.22222
    string firstpart = match.Groups[1].Value;        // FirstWord
    string secondpart = match.Groups[2].Value;       // 22222
}
FlyingTeller
Dvd Wang