
I have a text file

    ;   Message Number
    ;   |         Time Offset (ms)
    ;   |         |        Type
    ;   |         |        |        ID (hex)
    ;   |         |        |        |     Data Length
    ;   |         |        |        |     |   Data Bytes (hex) ...
    ;   |         |        |        |     |   |
    ;---+--   ----+----  --+--  ----+---  +  -+ -- -- -- -- -- -- --
         1)         2.0  Rx         0400  8  01 5A 01 57 01 D2 A6 02 
         2)         8.6  Rx         0500  8  02 C1 02 C9 02 BE 02 C2 
         3)        36.2  Rx         0401  8  01 58 01 59 01 01 01 01 
         4)        41.7  Rx         01C4  8  27 9C 64 8C 00 03 E8 08 
         5)        43.1  Rx         0501  8  02 C0 02 C1 02 C6 02 C0 
         6)        62.7  Rx         01C2  8  27 9C 60 90 00 0F 04 08 

and I am trying to collect just the ID from each line of this file. I have the expression and have tested that it works, but when I try to collect the matches, the output shows the whole line instead of just the ID.

        var ofd = new OpenFileDialog
        {
            Filter = "TRC File (*.trc*)|*.trc*",
            Multiselect = true,
        };

        ofd.ShowDialog();

        string path = ofd.FileName;
        List<string> alllinesText = File.ReadAllLines(path).ToList();
        foreach (string id in alllinesText)
        {
            Regex rx = new Regex(@"\d\d[\d|\w][\d|\w]\s\s");
            Console.Write(id.ToString());
            MatchCollection matches1 = rx.Matches(id);
            Console.WriteLine(matches1);

        }

        foreach (string data in alllinesText)
        {
            Regex rx2 = new Regex(@"[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w].[\w\d][\d\w]");
            Console.Write(data.ToString());
            MatchCollection matches2 = rx2.Matches(data);
        }

The output is

     28817)    347963.1  Rx         01C2  8  01 00 00 00 00 00 00 6F System.Text.RegularExpressions.MatchCollection
     28818)    347966.3  Rx         04E2  8  64 04 10 15 F5 00 00 08 System.Text.RegularExpressions.MatchCollection
     28819)    347967.2  Rx         01C4  8  27 14 63 8C 00 03 E7 08 System.Text.RegularExpressions.MatchCollection
     28820)    348017.0  Rx         03C4  8  7F 8A 7F 80 7F FA 96 0F System.Text.RegularExpressions.MatchCollection
     28821)    348023.1  Rx         0405  8  01 57 01 58 01 DB 93 02 System.Text.RegularExpressions.MatchCollection
     28822)    348029.6  Rx         0505  8  02 BB 02 BC 02 BD 02 BF System.Text.RegularExpressions.MatchCollection
  • FYI: `\d` is already included in `\w`, so you can simplify your regex a bit by replacing every `[\w\d]` with a plain `\w`. Moreover, if you want to match hex digits, use `[a-fA-F0-9]`. – Toto Jun 17 '19 at 16:30
  • `Console.Write(data.ToString())` writes the entire line, not the text that matches the expression. In fact, you discard the text that matches the expression. – Dour High Arch Jun 17 '19 at 16:30
  • Do you need to use regex? It looks like a fixed width file (minus the weird multi-line header). I would think you could get away with substringing each line. – Broots Waymb Jun 17 '19 at 16:31
  • Possible duplicate of [Read fixed-width record from text file](https://stackoverflow.com/questions/162727/). – Dour High Arch Jun 17 '19 at 16:32
  • To get the numbers after `Rx` you may use `Regex.Matches(s, @"\bRx\s+(\d+)").Cast<Match>().Select(x => x.Groups[1].Value)` – Wiktor Stribiżew Jun 17 '19 at 16:36
  • If the id has letters and underscores use `\w` instead of `\d`. – Wiktor Stribiżew Jun 17 '19 at 17:10
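
To make the comments above concrete, here is a minimal sketch of reading only the captured ID instead of printing the line or the `MatchCollection` object itself. It assumes the ID is always the hex token that follows the `Rx` type column; the class name `TraceIds` and the method `ExtractIds` are hypothetical names chosen for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TraceIds
{
    // Assumption: the ID is the run of hex digits right after the "Rx" column.
    private static readonly Regex IdPattern =
        new Regex(@"\bRx\s+([0-9A-Fa-f]+)");

    // Returns one ID per data line; header lines contain no "Rx" and are skipped.
    public static List<string> ExtractIds(IEnumerable<string> lines)
    {
        var ids = new List<string>();
        foreach (string line in lines)
        {
            Match m = IdPattern.Match(line);
            if (m.Success)
            {
                // Groups[1] holds only the captured ID, not the whole line.
                ids.Add(m.Groups[1].Value);
            }
        }
        return ids;
    }

    public static void Main()
    {
        string[] sample =
        {
            "    ;   |         |        |        ID (hex)",
            "         1)         2.0  Rx         0400  8  01 5A 01 57 01 D2 A6 02 ",
            "         4)        41.7  Rx         01C4  8  27 9C 64 8C 00 03 E8 08 ",
        };

        foreach (string id in ExtractIds(sample))
        {
            Console.WriteLine(id); // prints 0400, then 01C4
        }
    }
}
```

Passing a `MatchCollection` to `Console.WriteLine` calls its `ToString()`, which returns the type name; that is why the original output ends with `System.Text.RegularExpressions.MatchCollection`.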

1 Answer


My guess is that we want a capturing group around a character class here, something like:

([A-Z0-9]{4})

Test

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"([A-Z0-9]{4})";
        string input = @" ;   Message Number
    ;   |         Time Offset (ms)
    ;   |         |        Type
    ;   |         |        |        ID (hex)
    ;   |         |        |        |     Data Length
    ;   |         |        |        |     |   Data Bytes (hex) ...
    ;   |         |        |        |     |   |
    ;---+--   ----+----  --+--  ----+---  +  -+ -- -- -- -- -- -- --
         1)         2.0  Rx         0400  8  01 5A 01 57 01 D2 A6 02 
         2)         8.6  Rx         0500  8  02 C1 02 C9 02 BE 02 C2 
         3)        36.2  Rx         0401  8  01 58 01 59 01 01 01 01 
         4)        41.7  Rx         01C4  8  27 9C 64 8C 00 03 E8 08 
         5)        43.1  Rx         0501  8  02 C0 02 C1 02 C6 02 C0 
         6)        62.7  Rx         01C2  8  27 9C 60 90 00 0F 04 08 ";
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}

Emma
  • Thank you! This helped. The code I ended up with was slightly different, but it has given me the list I need. – Andrew Lerma Jun 17 '19 at 16:44
  • If I wanted to put that `m.Value` in a list to combine with another list, what would be the best way to do this? If you don't mind me asking, @Emma – Andrew Lerma Jun 17 '19 at 17:03
  • Keep in mind that this works only as long as the message number has fewer than 4 digits. If the message number can reach 4 digits or more, you may need to use `@"Rx *([A-Z0-9]{4})"` and then `m.Value.Substring(m.Value.Length - 4)` to get the actual result you want. – Terry Tyson Jun 17 '19 at 17:03
  • @Terry No need to substring: the value is captured, so access `Groups[1]`. See [my comment](https://stackoverflow.com/questions/56635380/parsing-with-regex-out-of-text-file#comment99842262_56635380) to the question. – Wiktor Stribiżew Jun 17 '19 at 17:07
  • @WiktorStribiżew Thanks, I missed that. – Terry Tyson Jun 17 '19 at 17:11
  • @Andrew if you need a relevant answer, please add the details to the question. Right now, it is not quite clear why you are using a regex at all, what the extraction rules are, and what exact result you expect. – Wiktor Stribiżew Jun 17 '19 at 17:19
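
As a rough illustration of the points raised in these comments, the sketch below anchors the pattern on `Rx` (so message numbers or time offsets with four or more digits are not mistaken for IDs), reads the values from named groups, and combines an ID list with a data-byte list. The exact extraction rules were never stated, so the column layout is an assumption taken from the sample file, and `TraceColumns`, `Combine`, and the `id`/`data` group names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TraceColumns
{
    // Assumed layout after "Rx": hex ID, decimal data length, then two-digit hex bytes.
    private static readonly Regex LinePattern = new Regex(
        @"\bRx\s+(?<id>[0-9A-Fa-f]+)\s+\d+\s+(?<data>(?:[0-9A-Fa-f]{2}\s*)+)");

    // Builds two parallel lists, then pairs them up element by element.
    public static List<string> Combine(IEnumerable<string> lines)
    {
        var ids = new List<string>();
        var data = new List<string>();
        foreach (string line in lines)
        {
            Match m = LinePattern.Match(line);
            if (!m.Success)
            {
                continue; // header or malformed line
            }
            ids.Add(m.Groups["id"].Value);
            data.Add(m.Groups["data"].Value.Trim());
        }
        // Zip pairs the n-th ID with the n-th data string.
        return ids.Zip(data, (id, bytes) => id + " -> " + bytes).ToList();
    }

    public static void Main()
    {
        string[] sample =
        {
            "    ;---+--   ----+----  --+--  ----+---  +  -+ -- -- -- -- -- -- --",
            "     28817)    347963.1  Rx         01C2  8  01 00 00 00 00 00 00 6F",
        };

        foreach (string row in Combine(sample))
        {
            Console.WriteLine(row); // prints 01C2 -> 01 00 00 00 00 00 00 6F
        }
    }
}
```

Because both lists are filled in the same loop iteration, they always have the same length, which keeps `Zip` from silently dropping entries.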