1

I am looking at ways to to use NOT regexp_like to exclude various terms from a string in the most efficient manner (as background, this data is used to find births that occurred outside the hospital system):

        regexp_like(note_text,'(birth|home|deliver|ambulance|car).{0,20}(deliv|birth|home|ambulance|car)','i')
and NOT regexp_like(note_text,'(apgar|CESAREAN|birth).{0,10}(minute|section)','i')

So I am doing a pile on of the NOT regexp_like to exclude cesarean births and apgar scores - am I better off to separate into like terms such as

and NOT regexp_like(note_text,'(apgar).{0,10}(minute)','i')
and NOT regexp_like(note_text,'(CESAREAN).{0,10}(section|operation)','i')

to make it more efficient?

Also as the regexp_like is an operator and not a function, is there is a way online to see what it is doing ?

collapsar
  • 17,010
  • 4
  • 35
  • 61
  • Performance-wise, you should be fine wrapping the negated terms into a single one with regexen structured like yours. Note however that your two samples approaches differ ( `apgar section` would only be filtered by the first ). You may consult the query plan for nested queries with individual regexen ( ie. `( select y.note_text from ( select x.note_text from ( select from ... where regexp_like(note_text, ) ) x where not regex_like ( x.note_text, ) ) y where not regex_like ( y.note_text, )` to reduce result set size before applying more complex filters. – collapsar Oct 18 '19 at 15:51
  • 1
    Also perhaps `ca?esarian`. – William Robertson Oct 18 '19 at 22:31

0 Answers0