SED delete comments but leave specific negated comment strings

Question

I'm trying to delete comments from a file, but importantly I want to keep a few specific strings:

## Something
# START
# END

These have to stay, along with all the uncommented lines, and I want to remove the rest with "d". This is important: I don't want to use print negation or other tricks, because this sed command also processes other things later with additional "-e" expressions.

Here is sample file:

# START
group1: user1@domain.com, user2@domain.com, user3@domain.com
group2: user3@domain.com, user4@domain.com

# S
#STAR
# start
# star
# comment is here
## Owner1
group3: user1@domain.com, user3@domain.com

## Owner2
group4: user4@domain.com, user3@domain.com

group3: user2@domain.com, user3@domain.com

# END

group5: user4@domain.com
alias1: user6@domain.com

I tried to use command like:

sed -e '/^#[^#]/d' sample.file

This removes each line that starts with "#" where the next character is NOT "#", so it keeps the "##" lines. But how can I remove the rest without losing the # START and # END lines?

I need to do this in the same command, without pipes and without "!p" or "p" variants; it has to be a modified version of this "d" command. I tried things like:

sed -e '/^#[^#][^S][^T][^A][^R][^T]/d'

or

sed -e '/^#[^#]\([^S][^T][^A][^R][^T]\|[^E][^N][^D]\)/d'

but nothing works the way I want. I'm not sure whether this is even possible this way.

Expected output:

# START
group1: user1@domain.com, user2@domain.com, user3@domain.com
group2: user3@domain.com, user4@domain.com

## Owner1
group3: user1@domain.com, user3@domain.com

## Owner2
group4: user4@domain.com, user3@domain.com

group3: user2@domain.com, user3@domain.com

# END

group5: user4@domain.com
alias1: user6@domain.com

Greetings & thanks for help :)

Please give more context of the code to demonstrate the restrictions you impose, i.e. show some dummy "additional -e". – Yunnosch, Mar 14 '18 at 21:47
How about using awk/gawk instead of sed? It might be a tad easier and more readable to fulfill your criteria that way. – 0x01, Mar 14 '18 at 21:47
Maybe show some versions of your coding attempts which basically do as required but fail your side requirements. Ideally show what does not work with them because of that. – Yunnosch, Mar 14 '18 at 21:48
I overlooked the 'no p' requirement. Can you explain what is problematic with it? – user unknown, Mar 14 '18 at 22:06

score 3 · Accepted Answer · answered Mar 14 '18 at 21:55

Try:

sed -E '/^##|^# START|^# END/bskip; /^#/d; :skip' file

Example

$ sed -E '/^##|^# START|^# END/bskip; /^#/d; :skip' file
# START
group1: user1@domain.com, user2@domain.com, user3@domain.com
group2: user3@domain.com, user4@domain.com

## Owner1
group3: user1@domain.com, user3@domain.com

## Owner2
group4: user4@domain.com, user3@domain.com

group3: user2@domain.com, user3@domain.com

# END

group5: user4@domain.com
alias1: user6@domain.com

How it works

/^##|^# START|^# END/bskip

For any line that matches ^## or ^# START or ^# END, we branch to the label skip.

/^#/d

All other lines that start with # are deleted.

:skip

This defines the label skip. Because it comes after the delete command, branching to it bypasses the delete.
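
The question mentions that further "-e" expressions follow in the same command. As a minimal sketch of how that might look, the example below appends one extra expression after the label. The substitution in the last "-e" and the shortened sample file are assumptions made for illustration; they are not part of the original answer.

```shell
# Small sample file (a subset of the lines from the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# START
group1: user1@domain.com
# comment is here
## Owner1
# END
EOF

# The first three expressions are the answer's script, one command per -e.
# The last -e is a hypothetical later step; it runs for every line that
# was not deleted, including the lines that branched to the label.
result=$(sed -E \
  -e '/^##|^# START|^# END/bskip' \
  -e '/^#/d' \
  -e ':skip' \
  -e 's/domain\.com/example.org/' \
  "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Since the label sits directly after the delete, anything placed after it still applies to the protected comment lines as well as to the ordinary lines.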

BSD/macOS

The above was tested with GNU sed. For BSD/macOS sed, try:

sed -E -e '/^##|^# START|^# END/bskip' -e '/^#/d' -e ':skip' file

John1024


Hi, that seems to work, but this -E looks different from -e; maybe it won't be supported on all Linux systems? I guess it is only there for a cleaner "OR" = "|"? This seems to work too: sed -e '/^##\|^# START\|^# END/bskip; /^#/d; :skip'. Thanks! – mike Mar 15 '18 at 08:05
@mike Yes, that is exactly right: `-E` invokes Extended Regular Expressions which, in our case, allows `|` in place of `\|`. And, yes, your version without `-E` will work just as well. – John1024 Mar 15 '18 at 17:41
I think that `-r` does the job on Linux. Thanks :) – mike Mar 15 '18 at 20:58
Yes, that is correct. `-r` will work with both modern and ancient GNU sed. However, POSIX appears to have settled on `-E` rather than `-r` for extended regular expressions. Consequently, all modern versions of GNU sed also accept `-E` to enable extended regexes. On BSD/macOS, only `-E` is accepted. – John1024 Mar 15 '18 at 21:04
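
If portability across sed implementations is the concern raised in these comments, one option is to avoid alternation altogether. The sketch below is an assumption-level variant, not something from the answer: it uses one address per protected pattern and "b" without a label, which branches to the end of the script. The sample lines are a subset of the question's file.

```shell
sample=$(mktemp)
printf '%s\n' '# START' 'group1: user1@domain.com' '#STAR' \
  '# comment is here' '## Owner1' '# END' > "$sample"

# Plain basic regular expressions: no -E, no -r, no \| needed.
# "b" with no label jumps to the end of the script, so the
# protected lines never reach the delete command.
portable=$(sed \
  -e '/^##/b' \
  -e '/^# START/b' \
  -e '/^# END/b' \
  -e '/^#/d' \
  "$sample")

printf '%s\n' "$portable"
rm -f "$sample"
```

One trade-off: because the protected lines jump to the very end, any later "-e" expressions would not be applied to them. The labelled form in the answer avoids that by placing the label before the later commands.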

user unknown · Answer 2 · 2018-03-14T22:18:00.003

This is more verbose than John1024's answer, but it works too:

sed -r 's/# ((START)|(END)).*/## \1/;/^#[^#].*/d;s/## ((START)|(END))/# \1/;' sample.conf

Convert the # START/END comments to the protected ## format, then delete the remaining single-# comments, then convert ## START/END back to # START/END.

At first I overlooked the 'no p' requirement; without it, the obvious solution is:

sed -r '/^# (START|END)/p;/^#[^#].*/d' sample.conf

Instead of building a complicated delete pattern for /d, you can use a simple print pattern with /p: the line is printed first and then deleted, so it appears exactly once.

Note that [^S][^T][^A][^R][^T] would match "END" followed by 2 trailing spaces. That may be unlikely, but if another 3- or 5-letter exception needs treatment, it gets ugly, if it isn't already.
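
To make the problem with per-character negation concrete, here is a small sketch that runs the question's first attempt on four hand-picked lines. The input lines are chosen for illustration; the second one is "# END" with two trailing spaces.

```shell
# Four test lines; the second one carries two trailing spaces.
input=$(printf '%s\n' '# START' '# END  ' '# S' '# comment is here')

# The attempt from the question: each bracket negates one character,
# and the pattern needs five characters after "# " to match at all.
kept=$(printf '%s\n' "$input" | sed -e '/^#[^#][^S][^T][^A][^R][^T]/d')

printf '%s\n' "$kept"
```

Only "# START" and "# S" survive. "# END" with trailing spaces is deleted because none of its five characters collides with the letter negated at the same position, while "# S" is kept simply because it is too short for the pattern to match.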
