I'm trying to use a regular expression to parse a log file generated from a dir command from PSFTP.

Dir example 1:

drwxr-xr-x 1 0        0                  0 Jun 21 13:13 .
drwxr-xr-x 1 0        0                  0 Jun 21 13:13 ..
-rw-r--r-- 1 0        0                897 Jun 20 15:02 EQA.txt
-rw-r--r-- 1 0        0                897 Jun 20 15:06 EQA1.txt
-rw-r--r-- 1 0        0                897 Jun 16 20:41 Test.txt
-rw-r--r-- 1 0        0                897 Jun 16 21:46 Test1.txt
-rw-r--r-- 1 0        0                897 Jun 21 13:13 Test4.txt
-rw-r--r-- 1 0        0                913 May 31 18:01 test.123456789.txt
psftp> bye 

Dir example 2:

drwx------    2 MikePC-apps users        4096 Apr  5  2016 .
drwx------    4 MikePC-apps users        4096 Jan 20  2016 ..
-rw-r--r--    1 MikePC-apps users          82 Apr  5  2016 test.txt.$01
-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test.txt.$02
-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test.txt.asc
-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test1.txt.$01
-rw-r--r--    1 MikePC-apps users        1927 Apr  4  2016 test.zip

From what I found online, if a file is older than six months or its timestamp is in the future, the listing shows the year instead of the time of day.
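
To make the two layouts concrete, here is a minimal sketch of a date-field pattern that accepts either form. The class name `ListingDate` and the method `Kind` are made up for illustration, and the pattern assumes English three-letter month names as in the samples above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ListingDate
{
    // Month name, day of month, then either HH:mm (recent file)
    // or a four-digit year (older or future-dated file).
    private static readonly Regex DateField = new Regex(
        @"(?<month>[A-Z][a-z]{2})\s+(?<day>\d{1,2})\s+(?:(?<time>\d{1,2}:\d{2})|(?<year>\d{4}))\s");

    // Returns "time", "year", or "none" depending on which layout the line uses.
    public static string Kind(string line)
    {
        Match m = DateField.Match(line);
        if (!m.Success) return "none";
        return m.Groups["time"].Success ? "time" : "year";
    }

    public static void Main()
    {
        // Prints "time"
        Console.WriteLine(Kind("-rw-r--r-- 1 0        0                897 Jun 20 15:02 EQA.txt"));
        // Prints "year"
        Console.WriteLine(Kind("-rw-r--r--    1 MikePC-apps users        1927 Apr  4  2016 test.zip"));
    }
}
```

Lines such as `psftp> bye` contain no date field, so `Kind` returns "none" for them and they can be skipped.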

For example 1, I'm using the regex `:\d\d\s.*` followed by a substring call to retrieve the file names.

But I don't know how to approach the second example. I was hoping there might be a parameter for the dir command that always includes the timestamp, so I could use the same regex. Or maybe there is another regular expression that can handle both examples.

Many thanks!

bli
Bonobo

1 Answer

Try the following:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            // Sample lines from both directory listings in the question.
            string[] inputs = {
                "drwxr-xr-x 1 0        0                  0 Jun 21 13:13 .",
                "drwxr-xr-x 1 0        0                  0 Jun 21 13:13 ..",
                "-rw-r--r-- 1 0        0                897 Jun 20 15:02 EQA.txt",
                "-rw-r--r-- 1 0        0                897 Jun 20 15:06 EQA1.txt",
                "-rw-r--r-- 1 0        0                897 Jun 16 20:41 Test.txt",
                "-rw-r--r-- 1 0        0                897 Jun 16 21:46 Test1.txt",
                "-rw-r--r-- 1 0        0                897 Jun 21 13:13 Test4.txt",
                "-rw-r--r-- 1 0        0                913 May 31 18:01 test.123456789.txt",
                "drwx------    2 MikePC-apps users        4096 Apr  5  2016 .",
                "drwx------    4 MikePC-apps users        4096 Jan 20  2016 ..",
                "-rw-r--r--    1 MikePC-apps users          82 Apr  5  2016 test.txt.$01",
                "-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test.txt.$02",
                "-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test.txt.asc",
                "-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test1.txt.$01",
                "-rw-r--r--    1 MikePC-apps users        1927 Apr  4  2016 test.zip"
            };

            // Five whitespace-delimited fields, then the date, then the filename.
            // Note: [^$]+ is a negated character class, so a filename containing '$'
            // (e.g. test.txt.$01) does not match; use .+ in the filename group to accept it.
            string pattern = @"^(?'attrib'[^\s]+)\s+(?'links'[^\s]+)\s+(?'owner'[^\s]+)\s+(?'group'[^\s]+)\s+(?'size'[^\s]+)\s+(?'date'.+)\s+(?'filename'[^$]+)$";

            foreach (string input in inputs)
            {
                Match match = Regex.Match(input, pattern);
                if (!match.Success)
                {
                    Console.WriteLine("no match : '{0}'", input);
                    continue;
                }
                Console.WriteLine("attrib : '{0}', links : '{1}', owner : '{2}', group : '{3}', size : '{4}', date : '{5}', filename : '{6}'",
                    match.Groups["attrib"].Value, match.Groups["links"].Value, match.Groups["owner"].Value, match.Groups["group"].Value,
                    match.Groups["size"].Value, match.Groups["date"].Value, match.Groups["filename"].Value);
            }
            Console.ReadLine();
        }
    }
jdweng
  • Thanks for the solution. I modified the pattern to `string pattern = @"^(?'attrib'[^\s]+)\s+(?'links'[^\s]+)\s+(?'owner'[^\s]+)\s+(?'group'[^\s]+)\s+(?'size'[^\s]+)\s+(?'date'.+)\s+(?'filename'.+)$";` so that names such as `test1.txt.$01` are also captured as the filename. +1 (sorry, I can't upvote). Thanks – Bonobo Jun 22 '17 at 20:13
  • The $01 is part of the filename – jdweng Jun 22 '17 at 20:29
  • I tested the regex `([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+(.+)\s+([^$]+)` on [Regexr](http://regexr.com/) and it stopped at `$` because `[^$]` is a negated set. I removed the `$` at the end of the pattern because the input is already split into lines – Bonobo Jun 22 '17 at 20:54
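
The difference between the two filename groups discussed in the comments can be checked with a small sketch. The class name `ListingPatterns` and the helper `Filename` are hypothetical names used only for this illustration; the two patterns are the ones quoted in the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ListingPatterns
{
    // Shared prefix: attrib, links, owner, group, size, date.
    private const string Prefix =
        @"^(?'attrib'[^\s]+)\s+(?'links'[^\s]+)\s+(?'owner'[^\s]+)\s+(?'group'[^\s]+)\s+(?'size'[^\s]+)\s+(?'date'.+)\s+";

    // Original answer: the filename group excludes '$' characters.
    public const string Original = Prefix + @"(?'filename'[^$]+)$";

    // Modified version from the comments: the filename group accepts any characters.
    public const string Modified = Prefix + @"(?'filename'.+)$";

    // Returns the captured filename, or null when the line does not match.
    public static string Filename(string pattern, string line)
    {
        Match m = Regex.Match(line, pattern);
        return m.Success ? m.Groups["filename"].Value : null;
    }

    public static void Main()
    {
        string line = "-rw-r--r--    1 MikePC-apps users          82 Aug 10  2016 test1.txt.$01";
        // The anchored original cannot reach the end of the line without crossing '$'.
        Console.WriteLine(Filename(Original, line) ?? "(no match)");
        // The modified pattern captures the whole name.
        Console.WriteLine(Filename(Modified, line));
    }
}
```

One caveat with both patterns: the greedy `date` group runs up to the last whitespace on the line, so a filename containing spaces would be split, with its leading words ending up in `date`.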