I'm trying to parse some data returned by a 3rd party app (a TSV file). I have all the data neatly parsed into fields (see Parse a TSV file), but I don't know how to handle the format of some of them.
Sometimes the data in a field is encapsulated like this:

=T("[FIELD_DATA]")

(That's some sort of Excel formula syntax, I believe.)
When that happens, specific characters are escaped as CHAR(ASCII_NUM), and the rest of the string is also encapsulated as in the example above, without the = which only appears at the beginning of the field.

So, does anyone have an idea how I could parse fields that look like this:

=T("- Merge User Interface of Global Xtra Alert and EMT Alert")&CHAR(10)&T("- Toaster ?!")&CHAR(10)&T("")&CHAR(10)&T("")&CHAR(10)&T("None")&CHAR(10)&T("")&CHAR(10)&T("None")

(any number of CHAR/T() groups).

I have been thinking of using a regex or looping over the string, but I'm not sure either approach is sound. Help, anyone?

Antoine
  • The real fun is when you have either & or " inside your strings, potentially escaped according to Excel rules. – weismat Mar 10 '10 at 18:29

2 Answers

I would go similarly to Darin, but his regex wasn't working for me. I would use this one:

(=T|&CHAR|&T)(\("*([A-Za-z?!0-9 -]*)"*\))+

You'll find that Groups[3] (Groups[0] is the whole match, Groups[1] the prefix, and Groups[2] the parenthesised part including any quotes) holds the data inside the () and the "" if the "" exist. For example this will find:

- Merge User Interface of Global Xtra Alert and EMT Alert

in:

=T("- Merge User Interface of Global Xtra Alert and EMT Alert")

and:

10

in:

&CHAR(10)

If you have:

&T("")

it will produce an empty string (not null) in that group.
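
To turn the matches back into the original text, something along these lines might work. It is only a sketch: ExcelFieldDecoder and Decode are made-up names, it reads the payload from the innermost capture group, and it assumes every CHAR argument is a decimal character code. Anything the character class in the pattern does not cover is silently skipped.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class ExcelFieldDecoder
{
    // Same pattern as above. Groups[1] is the prefix (=T, &CHAR or &T),
    // Groups[2] the parenthesised part, Groups[3] the text inside it.
    private static readonly Regex Token =
        new Regex(@"(=T|&CHAR|&T)(\(""*([A-Za-z?!0-9 -]*)""*\))+");

    // Hypothetical helper: rebuilds the plain text of one encoded field.
    public static string Decode(string field)
    {
        var result = new StringBuilder();
        foreach (Match m in Token.Matches(field))
        {
            string payload = m.Groups[3].Value;
            if (m.Groups[1].Value.EndsWith("CHAR", StringComparison.Ordinal))
            {
                // CHAR(10) becomes a line feed; non-numeric payloads are ignored.
                int code;
                if (int.TryParse(payload, out code))
                {
                    result.Append((char)code);
                }
            }
            else
            {
                result.Append(payload);
            }
        }
        return result.ToString();
    }
}
```

Calling ExcelFieldDecoder.Decode on a field gives back the text with the CHAR codes replaced by the characters they stand for.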

Hope this helps.

Tim C
using System;
using System.Text.RegularExpressions;

class Program
{
    public static void Main(string[] args)
    {
        var input = @"=T(""- Merge User Interface of Global Xtra Alert and EMT Alert"")&CHAR(10)&T(""- Toaster ?!"")&CHAR(10)&T("""")&CHAR(10)&T("""")&CHAR(10)&T(""None"")&CHAR(10)&T("""")&CHAR(10)&T(""None"")";
        // Capture the text between the quotes of every T("...") group.
        // The CHAR(10) separators are not matched; each capture is printed on its own line.
        var matches = Regex.Matches(input, @"T\(\""([^\""]*)\""\)");
        foreach (Match match in matches)
        {
            Console.WriteLine(match.Groups[1].Value);
        }
    }
}
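
A pattern that stops at the first quote would cut a string short if the data itself contained a quote. As a rough sketch of the "looping the string" idea from the question, the scanner below walks the field character by character. It assumes the third-party app follows Excel's convention of writing a literal quote inside a string as two quotes, which the question does not confirm. ExcelFormulaScanner and Decode are hypothetical names.

```csharp
using System;
using System.Text;

static class ExcelFormulaScanner
{
    // Hypothetical helper: decodes =T("...")&CHAR(n)&T("...") into plain text.
    public static string Decode(string field)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < field.Length)
        {
            if (string.CompareOrdinal(field, i, "T(\"", 0, 3) == 0)
            {
                i += 3; // step past T("
                while (i < field.Length)
                {
                    if (field[i] == '"')
                    {
                        // Assumption: "" inside a string is one literal quote.
                        if (i + 1 < field.Length && field[i + 1] == '"')
                        {
                            sb.Append('"');
                            i += 2;
                            continue;
                        }
                        i++; // closing quote
                        break;
                    }
                    sb.Append(field[i]);
                    i++;
                }
            }
            else if (string.CompareOrdinal(field, i, "CHAR(", 0, 5) == 0)
            {
                int close = field.IndexOf(')', i);
                if (close < 0) break; // malformed input: stop scanning
                int code;
                if (int.TryParse(field.Substring(i + 5, close - i - 5), out code))
                {
                    sb.Append((char)code);
                }
                i = close + 1;
            }
            else
            {
                i++; // skips '=', '&', ')' and anything unrecognised
            }
        }
        return sb.ToString();
    }
}
```

Because an & inside a quoted string is copied as ordinary text, it is not mistaken for the separator between groups. A caller would probably want to check that the field starts with =T( before decoding, since plain fields would otherwise come back empty.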
Darin Dimitrov