
I'm trying to use regex to parse the elements in the result of AT commands.

The structure of this is as follows:

OK
AT!GSTATUS?
!GSTATUS:
Current Time:  2420             Temperature: 30
Reset Counter: 1                Mode:        ONLINE
System mode:   LTE              PS state:    Attached
LTE band:      B30              LTE bw:      10 MHz
LTE Rx chan:   1234             LTE Tx chan: 12345
LTE CA state:  NOT ASSIGNED
EMM state:     Registered       Normal Service
RRC state:     RRC Idle
IMS reg state: No Srv

PCC RxM RSSI:  -74              RSRP (dBm):  -103
PCC RxD RSSI:  -74              RSRP (dBm):  -104
Tx Power:      0                TAC:         123A (1234)
RSRQ (dB):     -12.4            Cell ID:     1234AB56 (12345678)
SINR (dB):      9.4

I want to capture each attribute name (the text before the colon) in one group and its value (the text after the colon) in another group of the match.


Match:

| Group 1 | Group 2             |
| Cell ID | 1232AA96 (12345678) |
| TAC     | 123A (1234)         |

Currently, I have come up with:

r" ([0-9A-Za-z()]+):\s*([0-9A-Za-z()]+\s?[0-9A-Za-z()]+[\n])" gm

https://regex101.com/r/DS6IIk/1

What would be the best way of approaching this?

EM-Creations
Falco
  • Maybe applying some of the logic from this will help? https://stackoverflow.com/questions/16099975/regular-expression-to-split-key-value – EM-Creations Jul 17 '18 at 10:48
  • The sample output is a bit confusing/ambiguous: There are two KV pairs in some lines, how exactly are they separated? (for example if they are separated with spaces, how many spaces should be matched?) On the 10th line: `EMM state: Registered Normal Service`, are the number of spaces between *Registered* and *Normal Service* the same as number of spaces used for separating KV pairs in one line? If so then how are you going to distinguish the aforementioned cases? – fardjad Jul 17 '18 at 10:49
  • Try https://regex101.com/r/DS6IIk/2 – Wiktor Stribiżew Jul 17 '18 at 10:50
  • @fardjad They're separated by a colon `:` and then any number of spaces after the colon. So the OP needs everything before the colon in group 1 and everything after the colon in group 2. They're trying to separate the key/value pairs. – EM-Creations Jul 17 '18 at 10:52
  • @EM-Creations I know that the keys and values are separated by colons, but there are cases where there are two Key/Value pairs in one line, like this: `RSRQ (dB): -12.4 Cell ID: 1234AB56 (12345678)`. As I understood, the pairs are separated with a fixed number of spaces, and looks like spaces are allowed in both keys and values. – fardjad Jul 17 '18 at 10:56
  • @fardjad Ah in that case I believe line breaks shouldn't matter. – EM-Creations Jul 17 '18 at 11:47

1 Answer


If the keys and values can contain "words" separated by a single space (and your sample shows that they can), you can match and capture chunks of non-whitespace characters joined by single spaces: one such run before the colon, and another after the colon and any number of spaces:

(\S+(?: \S+)*): *(\S+(?: \S+)*)

See the regex demo

To match any horizontal whitespace, replace the spaces with [^\S\r\n] pattern.
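A minimal sketch of that substitution, assuming Python's `re` module (the names `HSPACE` and `PAIR_H` and the tab-separated sample line are made up for illustration):

```python
import re

# [^\S\r\n] means "whitespace, but not CR or LF", i.e. horizontal whitespace.
HSPACE = r"[^\S\r\n]"

# Same pattern as above, with every literal space replaced by HSPACE.
PAIR_H = re.compile(rf"(\S+(?:{HSPACE}\S+)*):{HSPACE}*(\S+(?:{HSPACE}\S+)*)")

# Hypothetical line where tabs, not spaces, pad the columns.
line = "Tx Power:\t0\t\tTAC:\t123A (1234)"
pairs_h = PAIR_H.findall(line)
print(pairs_h)  # [('Tx Power', '0'), ('TAC', '123A (1234)')]
```

Note that with this variant a single tab between a value and the next key would be treated like a word separator, so the two pairs would run together; it relies on at least two whitespace characters between columns, just as the space-only pattern does.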

Details

  • (\S+(?: \S+)*) - Group 1: one or more non-whitespace chars, followed by zero or more consecutive occurrences of a single space (or, if you use [^\S\r\n] instead, a single whitespace char other than LF and CR) and then 1+ non-whitespace chars
  • : * - a colon followed by 0+ spaces
  • (\S+(?: \S+)*) - Group 2: same as Group 1 pattern.
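
As a rough illustration, assuming Python (which the `r"..."` literal in the question suggests), the pattern can be applied with `re.findall` to a shortened copy of the sample output. The names `PAIR`, `sample` and `pairs` are only illustrative:

```python
import re

# Key: words joined by single spaces, then a colon and optional spaces,
# then the value: again words joined by single spaces.
PAIR = re.compile(r"(\S+(?: \S+)*): *(\S+(?: \S+)*)")

# A shortened copy of the !GSTATUS output from the question.
sample = """\
!GSTATUS:
Current Time:  2420             Temperature: 30
LTE band:      B30              LTE bw:      10 MHz
EMM state:     Registered       Normal Service
Tx Power:      0                TAC:         123A (1234)
RSRQ (dB):     -12.4            Cell ID:     1234AB56 (12345678)
SINR (dB):      9.4
"""

# findall returns one (key, value) tuple per match.
pairs = PAIR.findall(sample)
for key, value in pairs:
    print(f"{key!r} -> {value!r}")
```

Two things to be aware of with this sketch: the trailing `Normal Service` on the `EMM state` line has no colon and is separated by several spaces, so it is not captured; and the full output repeats the key `RSRP (dBm)`, so a list of tuples is safer than a `dict` if both values are needed.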
Wiktor Stribiżew