
I need to parse an EDIFACT message in Python.
To find a segment, e.g. UNB, I am trying to use a regex

pattern = "UNB(.*?)(?<!\?)(\?\?)*[']"  

and the test string

message = "UNA+456+6:54+654'UNB+64+654+54?'UNC+54+654+654'"  

The segment delimiter is ' (apostrophe) and ? is the escape character. In RegexCoach the matched string is UNB+64+654+54?'UNC+54+654+654'
That is right, because the first apostrophe after UNB is escaped.
But in Python 3.5

re.match(pattern,message)

returns None :( Do you have any idea where the error is? Or a suggestion for another solution?

Thanks
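
A possible explanation, offered as a sketch rather than a confirmed diagnosis: `re.match` only attempts a match at the very start of the string, and the test message begins with `UNA`, not `UNB`. `re.search` scans every position, which is closer to what RegexCoach does. The snippet below reuses the pattern and message from the question; writing the pattern as a raw string is an added assumption, made so the backslashes reach the regex engine unchanged.

```python
import re

# Same pattern as in the question, written as a raw string (assumption).
pattern = r"UNB(.*?)(?<!\?)(\?\?)*[']"
message = "UNA+456+6:54+654'UNB+64+654+54?'UNC+54+654+654'"

# re.match is anchored at position 0, where the text is "UNA".
anchored = re.match(pattern, message)

# re.search tries each position in turn until the pattern fits.
scanned = re.search(pattern, message)

print(anchored)  # None
if scanned:
    print(scanned.group(0))  # UNB+64+654+54?'UNC+54+654+654'
```

With `re.search` the result agrees with the RegexCoach output quoted above, because the escaped apostrophe after `54?` is skipped.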

  • Is this your actual code? `patter` isn't defined and I got a SyntaxError when I tried `pattern = 'UNB(.*?)(?<!\?)(\?\?)*[']' `. – Kevin Oct 21 '16 at 12:38
  • I can't produce any result. Are you sure the pattern you provided is *exactly* as you are using it? – idjaw Oct 21 '16 at 12:40
  • `patter` was a typo; it should be `pattern`. – Jan Červený Oct 21 '16 at 12:46
  • Yes, the pattern is exactly the same. I tried it in the RegexCoach application and it works there, but it does not work in Python. – Jan Červený Oct 21 '16 at 12:48
  • Try BOTS, open source Python EDI translator: http://bots.sourceforge.net/en/about_features.shtml – Andrew Oct 28 '16 at 17:51
  • I parse EDI messages with regexes; you could see whether you can port it to Python. My approach is to split the message in separate steps: segments, data elements, composite data elements. See an older version of the package https://github.com/php-edifact/edifact/blob/3bd7205d3e1d9fe9bc3f94cdb4ad40f7ee9dea11/src/EDI/Parser.php (the newer version contains parameters to deal with UNA changes). – sabas Dec 22 '16 at 10:33
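
The stepwise splitting described in the last comment could look roughly like this in Python. This is a minimal sketch and not a port of the linked PHP parser: it uses a plain character scan instead of regexes, the helper name `split_unescaped` is hypothetical, and the delimiters (`'`, `+`, `:`) and the escape character `?` are assumed to be fixed. A UNA segment that redefines them is not handled.

```python
def split_unescaped(text, delimiter, escape="?"):
    """Split text on delimiter, ignoring delimiters protected by the escape char."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            # Keep the escape and the character it protects together.
            current.append(text[i:i + 2])
            i += 2
        elif ch == delimiter:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


message = "UNA+456+6:54+654'UNB+64+654+54?'UNC+54+654+654'"

# Step 1: segments. The trailing terminator leaves an empty string, dropped here.
segments = [s for s in split_unescaped(message, "'") if s]

# Step 2: data elements per segment. Step 3: components per data element.
parsed = [
    [split_unescaped(element, ":") for element in split_unescaped(segment, "+")]
    for segment in segments
]

for segment in parsed:
    print(segment[0][0], segment[1:])
```

Escape characters are left in the output on purpose, so each level can be split again without losing track of which delimiters were escaped; they would be removed in a final pass.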
