I have the text:
This is a test. This is only a test! If there were an emergency, then Information would be provided for you.
I want to be able to determine which words start sentences. What I have now is:
$ cat <FILE> | perl -pe 's/[\s.?!]/\n/g;'
This just replaces whitespace and sentence-ending punctuation with newlines, giving me:
This
is
a
test

This
is
only
a
test

If
there
were
an
emergency,
then
Information
would
be
provided
for
you
From here I could somehow extract the words that have either nothing above them (start of file) or a blank line above them, but I am unsure exactly how to do this.