I am trying to keep the rows of a CSV file that match a certain string, using a regex in RapidMiner on a Windows 8 machine. I wrote a regex that selects the right rows, but the output does not retain line breaks and appears as one continuous string. I would appreciate any suggestions on how to keep the line breaks.
My file looks like this:
"ABCDEF","text",numbers,"JAN 1, 2014","text",numbers,10
"BCDEFG","text",numbers,"JAN 1, 2014","text",numbers,1
"CDEFGH","text",numbers,"FEB 1, 2014","text",numbers,12
"CDEFGH","text",numbers,"DEC 1, 2013","text",numbers,8
The following regexes select the text of the correct rows (1-3) but drop the line breaks from the output:
"[A-Z]*".*2014.*?(?=[\r\n$]+)
"[A-Z]*".*2014.*?(?=([\r\n]{2}))
"[A-Z]*".*2014.*?(?=([\r\n]{2}[\r\n$]*))
I also tried multiline mode with the following regex, but with the same result:
(?m)^"[A-Z]*".*2014.*?(?=[\r\n]+)$
My output looks like the following:
"ABCDEF","text",numbers,"JAN 1, 2014","text",numbers,10 "BCDEFG","text",numbers,"JAN 1, 2014","text",numbers,1 "CDEFGH","text",numbers,"FEB 1, 2014","text",numbers,12
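For anyone who wants to reproduce the symptom outside RapidMiner, here is a minimal sketch in Python. It assumes Python's re module behaves like RapidMiner's regex engine for this pattern, which I have not verified, and that the file uses Windows \r\n line endings. The variable names are made up for illustration. It suggests the cause: a lookahead is zero-width, so the line break it checks for is never part of the matched text.

```python
import re

# Sample rows from the question, assuming Windows-style \r\n line endings.
data = (
    '"ABCDEF","text",numbers,"JAN 1, 2014","text",numbers,10\r\n'
    '"BCDEFG","text",numbers,"JAN 1, 2014","text",numbers,1\r\n'
    '"CDEFGH","text",numbers,"FEB 1, 2014","text",numbers,12\r\n'
    '"CDEFGH","text",numbers,"DEC 1, 2013","text",numbers,8\r\n'
)

# First regex from the question: the line break is only inside a lookahead.
lookahead_pattern = re.compile(r'"[A-Z]*".*2014.*?(?=[\r\n$]+)')

# Each match stops just before the line break, so none of them contains one.
matches = lookahead_pattern.findall(data)
has_line_break = any("\r" in m or "\n" in m for m in matches)
```

Whatever then concatenates the matches has no line breaks left to write out.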
Thank you in advance.
EDIT: With hwnd's and others' excellent suggestions, I came up with the following expression, which worked in RapidMiner: (?m)^("[A-Z]+".*2014.*)\r\n
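As a rough check of why this version keeps the line breaks, here is a small Python sketch. It assumes .* on either side of 2014, as in the earlier attempts, and it uses Python's re module rather than RapidMiner's engine, so treat it as an illustration only. The names data, pattern and kept are hypothetical. The point is that \r\n is consumed by the match itself instead of sitting inside a lookahead.

```python
import re

# Sample rows from the question, assuming Windows-style \r\n line endings.
data = (
    '"ABCDEF","text",numbers,"JAN 1, 2014","text",numbers,10\r\n'
    '"BCDEFG","text",numbers,"JAN 1, 2014","text",numbers,1\r\n'
    '"CDEFGH","text",numbers,"FEB 1, 2014","text",numbers,12\r\n'
    '"CDEFGH","text",numbers,"DEC 1, 2013","text",numbers,8\r\n'
)

# The trailing \r\n is part of the match, so it survives in the output.
pattern = re.compile(r'^"[A-Z]+".*2014.*\r\n', re.MULTILINE)

# Concatenating the matches yields the three 2014 rows, one per line.
kept = "".join(pattern.findall(data))
```

Printing kept shows the three 2014 rows on separate lines, and the 2013 row is dropped.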