I am trying to locate (and then extract) a repeated phrase using the code below. I need phrases that begin with "approximately" and end with "closed".
For example: "approximately $162.9 million in total assets and $144.5 million in total deposits was closed"
str_locate(x,"(\b[Aa]pproximately\b)(.*)(\b[Cc]losed\b)")
str_extract(x,"(\b[Aa]pproximately\b)(.*)(\b[Cc]losed\b)")
The above code returns NA for both the start and end positions of the phrase. Here is a sample of the character vector that contains the phrases (it is a web page of publicly available FDIC information):
"206-4662).\r\n\r\nDecember \r\n\r\n\r\n Western National Bank, Phoenix, AZ with approximately $162.9 million in total assets and $144.5 million in total deposits was closed. Washington Federal, Seattle, WA has agreed to assume all deposits excluding certain brokered deposits.\r\n(PR-195-2011) \r\n\r\n\r\n\r\n Premier Community Bank of the Emerald Coast, Crestview, FL with approximately $126.0 million in total assets and $112.1 million in total deposits was closed. Summit Bank, N.A., Panama City, FL has agreed to assume all deposits.\r\n(PR-194-2011)"
I may be using regular expressions incorrectly, as I am new to them, so any advice is much appreciated.
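One likely cause, offered as a sketch rather than a confirmed fix: inside an R string literal, "\b" is read as a backspace character, so the regex engine never sees a word boundary. Writing "\\b" passes the regex escape through. The example below assumes the stringr package, a shortened copy of the sample text stored in a variable named `x` as in the calls above, and that each phrase should stop at the first "closed" (hence the lazy `.*?`). It uses `str_extract_all()` and `str_locate_all()` because the phrase occurs more than once.

```r
library(stringr)

# Shortened copy of the sample text (assumption: the real vector is called x)
x <- paste0(
  "Western National Bank, Phoenix, AZ with approximately $162.9 million in ",
  "total assets and $144.5 million in total deposits was closed. Washington ",
  "Federal, Seattle, WA has agreed to assume all deposits.\r\n(PR-195-2011) ",
  "\r\n\r\n Premier Community Bank of the Emerald Coast, Crestview, FL with ",
  "approximately $126.0 million in total assets and $112.1 million in total ",
  "deposits was closed."
)

# "\\b" in R source becomes the regex word boundary \b;
# ".*?" is lazy, so each match ends at the nearest "closed"
pattern <- "\\b[Aa]pproximately\\b.*?\\b[Cc]losed\\b"

# One list element per input string; [[1]] picks the results for x
phrases   <- str_extract_all(x, pattern)[[1]]  # character vector of matches
positions <- str_locate_all(x, pattern)[[1]]   # matrix with start/end columns

print(phrases)
print(positions)
```

With a single-string `x`, `phrases` should hold one element per bank entry, and `positions` one row per match.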