My text data is consistently delimited by vertical bars ("|"), but the text between the bars is rarely consistent and often contains characters that could themselves be taken as separators ("-", ",", and carriage returns). I would like the result to have exactly 2 columns (report number and comment).
Goal:
ReportNumber | Report |
---|---|
4312822 | Comment: This person did a great job working with others. -Class standing was 1/10 -Final academic average/standing was 83.51% /209 out of 265 |
3059758 | Comment, Part I: This is a dummy report. |
What the data looks like:
4312822|Comment: This person did a great job working with others.
-Class standing was 1/10
-Final academic average/standing was 83.51% /209 out of 265|
3059758|Comment, Part I: This is a dummy report.|
I've tried both read.delim and read.table:
Reports = read.delim('reports.txt', sep = "|", stringsAsFactors = FALSE, skipNul = TRUE, blank.lines.skip = TRUE)
The result, however, is jumbled and not split neatly on the "|".
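One possible way around this, as a minimal sketch rather than a definitive solution: read.delim and read.table treat every line break as the end of a record, so a comment that spans several lines gets broken into several rows. The sketch below instead reads the whole file as a single string, splits it on "|", and pairs the pieces up as (number, comment). It assumes every record has exactly two "|"-terminated fields and that no comment contains a literal "|". The temporary file is only a stand-in for 'reports.txt' so the example runs on its own.

```r
# Recreate the sample data in a temporary file (stand-in for 'reports.txt').
path <- tempfile(fileext = ".txt")
writeLines(c(
  "4312822|Comment: This person did a great job working with others.",
  "-Class standing was 1/10",
  "-Final academic average/standing was 83.51% /209 out of 265|",
  "3059758|Comment, Part I: This is a dummy report.|"
), path)

# Read the file as ONE string, so line breaks inside a comment
# are not mistaken for the end of a record.
txt <- paste(readLines(path, warn = FALSE), collapse = "\n")

# Split on every literal "|" and strip surrounding whitespace/newlines.
fields <- trimws(strsplit(txt, "|", fixed = TRUE)[[1]])

# Trailing blank lines in the file can leave one empty piece at the end; drop it.
if (length(fields) %% 2 == 1 && !nzchar(fields[length(fields)])) {
  fields <- fields[-length(fields)]
}

# Assumption: fields alternate number, comment, number, comment, ...
m <- matrix(fields, ncol = 2, byrow = TRUE)

Reports <- data.frame(
  ReportNumber = m[, 1],
  # Collapse the embedded line breaks in each comment into single spaces.
  Report = gsub("\\s*[\r\n]+\\s*", " ", m[, 2]),
  stringsAsFactors = FALSE
)

print(Reports)
```

ReportNumber is left as character here; as.integer(Reports$ReportNumber) would convert it if every value is numeric. If a comment could ever contain a "|" of its own, the alternating-fields assumption breaks and the records would need a stricter pattern to be told apart.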