Let's say I have an input file that looks like this:
2016-06-03 21:00:14 > user1 has connected.
2016-06-03 21:00:14 > user1 has connected.
2016-06-03 21:00:15 > user1 has connected.
2016-06-03 21:00:22 > foobar disconnected.
2016-06-03 21:00:22 > foobar disconnected.
2016-06-03 21:00:29 > user2 has connected.
2016-06-03 21:00:29 > user2 has connected.
2016-06-03 21:00:29 > user2 has disconnected.
2016-06-03 21:00:30 > user2 has disconnected.
2016-06-03 21:00:30 > user2 has disconnected.
I could remove all of the duplicate consecutive lines, ignoring the first two columns, with uniq -f2 file.txt (its output is shown below for comparison), but I'm looking for a way to remove only the duplicates that contain "has connected.", so the output would look like this:
2016-06-03 21:00:14 > user1 has connected.
2016-06-03 21:00:22 > foobar disconnected.
2016-06-03 21:00:22 > foobar disconnected.
2016-06-03 21:00:29 > user2 has connected.
2016-06-03 21:00:29 > user2 has disconnected.
2016-06-03 21:00:30 > user2 has disconnected.
2016-06-03 21:00:30 > user2 has disconnected.
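For comparison, this is what uniq -f2 file.txt prints on the sample input; it collapses every run of consecutive duplicates, so the repeated disconnect lines I want to keep are lost as well:

2016-06-03 21:00:14 > user1 has connected.
2016-06-03 21:00:22 > foobar disconnected.
2016-06-03 21:00:29 > user2 has connected.
2016-06-03 21:00:29 > user2 has disconnected.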
I suppose this could be done just by matching a fixed string ("has connected."), but I'm also interested in a command that would work with a regex.
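I imagine something along these lines in awk might work; this is only a rough sketch I checked by hand against the sample input, with the regex /has connected\.$/ standing in for the fixed string:

awk '{
    tail = $0
    sub(/^[^ ]+ +[^ ]+ +/, "", tail)       # drop the first two columns, like uniq -f2
    dup = (tail == prev)                    # consecutive duplicate, ignoring those columns?
    prev = tail
    if (dup && tail ~ /has connected\.$/)   # suppress only duplicates matching the pattern
        next
    print
}' file.txt

On the sample input this seems to produce the output above, and since the match is a regex, the fixed-string case is just the simplest pattern. Still, I'm not sure this is the cleanest way, so I'd like to see how it could be done with tools like uniq or sed.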
I took a look at the answers to this question, but I couldn't adapt the commands to work with the input I have.