Hi – I don’t have a ton of shell scripting experience, and I need to create a bash script to split a single large note field into an array of individual notes, using a regex (or multiple regexes) as delimiters. My input looks like this:
This is the first note (AA 01/23 10:00A)This is the second note(AB 01/24 11:00P) This is the third note (C101/25/201512:15A)This is the fourth (and final) note(D2 03/10 03:15P)
My array needs to look like this:
This is the first note AA 01/23 10:00A
This is the second note AB 01/24 11:00P
This is the third note C1 01/25/2015 12:15A
This is the fourth (and final) note D2 03/10 03:15P
Details:
- the notes can contain parentheses, hence my thought that I will need to use a regex instead of just splitting after each “)”
- the dates in note “tags” (the part contained within the parentheses) can have two distinct formats – some have spaces before and after the date with just a mm/dd date, and others show the date as mm/dd/yyyy with no spaces before and after.
- the note tags always begin with “(AA”, where AA can be any combination of uppercase alpha and numeric characters
- the note tags always end with “HH:MMA)” where HH is valid hours, MM is valid minutes, and the final character before the ) is either A or P.
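As a rough illustration of the rules above, here is a minimal sketch of one bash extended regex that describes a whole note tag, in either date format. The names tag_re and is_tag are made up for this example. It assumes HH and MM are always two digits, as in the sample, and it only checks the shape of the tag, not whether the hours and minutes are valid.

```bash
#!/usr/bin/env bash
# Sketch only: one ERE for a complete note tag (hypothetical names).
LC_ALL=C   # keep [A-Z] limited to ASCII uppercase letters

# Groups: 1 = two-character code, 2 = date (mm/dd or mm/dd/yyyy),
#         3 = optional /yyyy part, 4 = HH:MM followed by A or P
tag_re='\(([A-Z0-9]{2}) ?([0-9]{2}/[0-9]{2}(/[0-9]{4})?) ?([0-9]{2}:[0-9]{2}[AP])\)'

# Succeeds only if the whole argument is a note tag.
is_tag() {
  local anchored="^${tag_re}\$"
  [[ $1 =~ $anchored ]]
}

for candidate in '(AA 01/23 10:00A)' '(C101/25/201512:15A)' '(and final)'; do
  if is_tag "$candidate"; then
    echo "tag:     $candidate"
  else
    echo "not tag: $candidate"
  fi
done
```

The pattern is kept in a variable and used unquoted on the right-hand side of =~, because quoting it there would make bash treat it as a literal string.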
I’ve defined two regexes to identify the beginning and end of the note tag, but I’m at a loss as to how to actually get the data into an array. My regexes are:
starttag='\([A-Z0-9]{2}'
endtag='[0-9]+:[0-9]+[AP]\)'
I’ve tried to create the array by setting IFS, but it appears that IFS cannot treat a multi-character string as a single delimiter – correct? My results are split on every individual character in my regex, instead of the entire regex being evaluated as one delimiter.
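A small sketch of the behaviour described here, using a made-up input string rather than my real data: IFS is treated as a set of single-character separators, so a two-character value such as ")(" splits on each character separately and leaves an empty field between them.

```bash
#!/usr/bin/env bash
# Sketch only: each character in IFS is its own delimiter.
input='one)(two)(three'   # hypothetical sample, not the real note field

# Split on ")" and "(" individually; ")(" is NOT seen as one delimiter.
IFS=')(' read -r -a parts <<< "$input"

echo "field count: ${#parts[@]}"
for i in "${!parts[@]}"; do
  printf '%s: [%s]\n' "$i" "${parts[$i]}"
done
```

The count comes out as 5 rather than 3, because every ")(" pair contributes an empty field between its two characters.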
Any help would be greatly appreciated.
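One possible direction, offered as a sketch rather than a definitive solution: instead of IFS, match the whole tag with bash's =~ operator in a loop and peel one note off the front of the string on each pass. The variable names (tag_re, notes, rest and so on) are invented for this example. It assumes two-digit hours and minutes as in the sample, and any text left over after the last tag is simply dropped.

```bash
#!/usr/bin/env bash
# Sketch only: split the note field on whole tags using [[ =~ ]] and BASH_REMATCH.
LC_ALL=C   # keep [A-Z] limited to ASCII uppercase letters

input='This is the first note (AA 01/23 10:00A)This is the second note(AB 01/24 11:00P) This is the third note (C101/25/201512:15A)This is the fourth (and final) note(D2 03/10 03:15P)'

# Groups: 1 = code, 2 = date (mm/dd or mm/dd/yyyy), 4 = HH:MM plus A or P
tag_re='\(([A-Z0-9]{2}) ?([0-9]{2}/[0-9]{2}(/[0-9]{4})?) ?([0-9]{2}:[0-9]{2}[AP])\)'

notes=()
rest=$input
while [[ $rest =~ $tag_re ]]; do
  tag=${BASH_REMATCH[0]}          # leftmost tag in the remaining text
  code=${BASH_REMATCH[1]}
  tag_date=${BASH_REMATCH[2]}
  tag_time=${BASH_REMATCH[4]}

  text=${rest%%"$tag"*}           # note text before that tag
  rest=${rest#*"$tag"}            # everything after that tag

  # Trim leading and trailing whitespace from the note text.
  text=${text#"${text%%[![:space:]]*}"}
  text=${text%"${text##*[![:space:]]}"}

  notes+=("$text $code $tag_date $tag_time")
done

printf '%s\n' "${notes[@]}"
```

Because "(and final)" does not fit the tag pattern, it stays inside the fourth note instead of being treated as a delimiter.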