I'm struggling with trying to capture multiple matches within a group of lines in a text file.

The data takes on a bunch of forms like

AO22_X1N_A9PP96CTS_C24 SYN_INC_187 ( .A0 ( test_so6 ) , .A1 ( n2218 ) , .B0 ( U_PAUSEdata_ff_int_28_ ) , .B1 ( n2 ) , .Y ( n2597 ) ) ;

NAND3_X1R_A9PP96CTUL_C16 SYN_INC_154 ( .A ( n1563 ) , .B ( U_PAUSEwcnt ) , .C ( n1640 ) , .Y ( n1467 ) ) ;

The first piece is a name. I might want that later, but for now I am interested in the ports, e.g. .A ( net ). Ideally I want to capture all the input net names (those on ports A, B, C, D, etc.) and the single output, .Y ( net ).

Eventually I want to store them into a hash where the output net is the key and the data is a ref to the array of inputs but for now I'm just trying to get all the input nets to be captured.

This is what I'm currently working with

open (FILE, "<maca") or die("Can not open $file");
  while (defined(my $cur_line = <FILE>)) {

    if ($cur_line =~ m/[A-Z].*?\.[A-C]\d* \( (.*?) \).*?;/mg) { 
      print "THIS gate $cur_line $1 $2 $3\n";  
      }
  }

I'm trying for this display

THIS gate NAND3_X1R_A9PP96CTUL_C16 SYN_INC_154 ( .A ( n1563 ) , .B ( U_PAUSEwcnt ) , .C ( n1640 ) , .Y ( n1467 ) ) ;

n1563 U_PAUSEwcnt n1640

But I get this. Actually I don't care about the first line, just the second; the first is for debugging. I thought the m would search multiple lines and the g would globally match the multi-line string. What am I missing?

THIS gate .B ( U_PAUSEwcnt ) , .C ( n1640 ) , .Y ( n1467 ) ) ;

n1640

togaclad
  • If `$cur_line` is a single line, you should not use a multiline regexp. Try instead to match the three fields in a single regexp without the `g` and `m` modifiers – Håkon Hægland Mar 08 '19 at 22:00
  • Use a Verilog parser: https://metacpan.org/pod/Verilog-Perl – toolic Mar 09 '19 at 00:37
  • @Håkon Hægland it is actually multiple lines in a file. The multi line data is terminated by ; – togaclad Mar 11 '19 at 20:04
  • @toolic I had no idea that there was a verilog parser. Thanks for pointing that out. I'll see if the synthesized netlist can be pulled in. I might use that or just steal the code from the module. Just want a simple test script. No need to pull the whole design into memory – togaclad Mar 11 '19 at 20:04
  • The `vhier` helper script might be a good place to start. – toolic Mar 11 '19 at 20:05
  • @togaclad So to clarify the format: `$cur_line` is equal to 5 lines in your example, or is it equal to the top two lines (since there is a `;` after the second line)? – Håkon Hægland Mar 11 '19 at 21:50
  • @toolic. There are lots of good examples in those modules. The one thing they do not do, and I really needed, was elaboration. Basically that is what I am after, but with the examples from those modules and the great comments from everyone I think I have enough to get this done. – togaclad Mar 18 '19 at 15:00

1 Answer

If I understand you correctly you're looking for something like this:

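# Outer loop: one match per instance, i.e. "name ( ... ) ;"
while ($data =~ /(\w+)\s*\((.+?)\)\s*;/gm) {
  my $line = $1;   # instance name, e.g. SYN_INC_154
  my $vals = $2;   # everything between the outer parentheses
  # Inner loop: one match per ".PORT ( net )" pair
  while ($vals =~ /\.(\w+)\s*\(\s*(\w+)\s*\)/g) {
    print "$line .. $1: $2.\n";
  }
}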

I called the variable $data since it holds all the lines, correct? I first split out each instance, capturing the string between the outer ( .. ), then pull out the port/net pairs. It looks like all names are alphanumeric plus "_", which \w captures nicely.
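Building on the same two-level match, here is a minimal sketch of the hash the question asks for (output net as key, array ref of input nets as value). It assumes the whole netlist is in one string, that the output port is always named Y, and that net names are word characters plus optional square brackets. The `%inputs_of` name and the sample `$data` are only for illustration. The `s` modifier lets `.` match newlines, in case an instance is wrapped over several lines.

```perl
use strict;
use warnings;

# Sample data based on the question; a real script would slurp the file.
# The second instance is wrapped onto two lines on purpose.
my $data = <<'END';
AO22_X1N_A9PP96CTS_C24 SYN_INC_187 ( .A0 ( test_so6 ) , .A1 ( n2218 ) , .B0 ( U_PAUSEdata_ff_int_28_ ) , .B1 ( n2 ) , .Y ( n2597 ) ) ;
NAND3_X1R_A9PP96CTUL_C16 SYN_INC_154 ( .A ( n1563 ) , .B ( U_PAUSEwcnt ) ,
    .C ( n1640 ) , .Y ( n1467 ) ) ;
END

my %inputs_of;    # output net => [ input nets ] (hypothetical name)

# Outer loop: one iteration per instance, "cell instance ( ports ) ;".
while ($data =~ /(\w+)\s+(\w+)\s*\((.+?)\)\s*;/gs) {
    my ($cell, $instance, $ports) = ($1, $2, $3);
    my ($output, @inputs);

    # Inner loop: one iteration per ".PORT ( net )" pair.
    while ($ports =~ /\.(\w+)\s*\(\s*([\w\[\]]+)\s*\)/g) {
        my ($port, $net) = ($1, $2);
        if ($port eq 'Y') { $output = $net }        # assumption: output is always .Y
        else              { push @inputs, $net }
    }
    $inputs_of{$output} = \@inputs if defined $output;
}

# Print each output net followed by its input nets.
print "$_ <= @{ $inputs_of{$_} }\n" for sort keys %inputs_of;
```

With the sample data this prints one line per output net, for example `n1467 <= n1563 U_PAUSEwcnt n1640`.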

Hope this helps?

Ossip
  • Oh I see it now. I was using a while to read in my $cur_line = and then using the same $cur_line to regex as i detailed. I did not realize I needed to pull the global multi string match into a variable and then use a while to walk through it. – togaclad Mar 18 '19 at 14:57
  • Ah I see - it would have been easier if you posted your full code ;) no problem, you can continue doing that (is a good idea). above code will also work if you use it with individual lines. (of course then you can use `if` and kick out the `gm` ) – Ossip Mar 19 '19 at 17:36
  • hmmm no still fighting with it. Seemed to be working at first but is not matching past the first line. – togaclad Mar 20 '19 at 18:07
  • Wait it works. Okay I had to use the gms modifier on the first while. `my $data = read_file("maca"); while ($data =~ m/([A-Z]\w+)\s+(\w+)\s+\((.+?)\)\s*;/mgs) { my $port_net_pairs = $3; while ($port_net_pairs =~ m/\.(\w+)\s+\(\s+(\w+)\s+\)/g) { my $port = $1; my $net = $2; print " $port : $net\n"; } }` Now the only challenge I see is some ports are not just word chars but have square brackets. ex .B ( TX_cnt[0] ) – togaclad Mar 20 '19 at 18:42
  • woohoo solved that with `while ($port_net_pairs =~ m/\.(\w+)\s+\(\s+([\w|\[|\]]+)\s+\)/g) {` – togaclad Mar 20 '19 at 18:47
  • great! in fact you don't even need the `|` in the pattern `[\w|\[|\]]+` so you can use `[\w\[\]]+` . if this answered the question please hit accept, thanks! – Ossip Mar 21 '19 at 19:41
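
To illustrate that last point: inside a bracketed character class `|` is a literal character, not alternation, so `[\w|\[|\]]` would also accept a stray pipe in a net name. A small sketch (the sample strings and variable names are made up for illustration):

```perl
use strict;
use warnings;

my $with_pipes    = qr/^[\w|\[|\]]+$/;   # '|' is a literal member of the class
my $without_pipes = qr/^[\w\[\]]+$/;     # word chars and square brackets only

# Compare how each pattern treats ordinary, bracketed, and piped names.
for my $net ('TX_cnt[0]', 'n1563', 'bad|net') {
    printf "%-10s with_pipes=%s without_pipes=%s\n", $net,
        ($net =~ $with_pipes    ? 'match' : 'no match'),
        ($net =~ $without_pipes ? 'match' : 'no match');
}
```

Both patterns accept `TX_cnt[0]` and `n1563`; only the version with `|` in the class accepts `bad|net`.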