Delete lines in a file not matching the pattern

Question

I am trying to migrate data that consists of many separate text files. One step is to delete all lines in the text files that are no longer used. The lines are key-value pairs. I want to delete everything in a file except the lines with certain keys. I do not know the order of the keys inside the file.

The keys I want to keep are e.g. version, date and number.

I found the question "Remove all lines except matching pattern line best practice (sed)" and tried the accepted answer. My sed command is:

sed '/^(version=.*$)|(date=.*$)|(number=.*$)/!d' file.txt

with a !d after the address to delete all lines NOT matching the pattern.

Example of the regex: https://regex101.com/r/LKfxpP/2

However, the command deletes all lines in my file. Where is my mistake? I assume my regex is wrong, but what is the error?

score 1 · Accepted Answer · answered Nov 09 '18 at 12:23

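Without the -E option, sed uses POSIX basic regular expressions (BRE), in which unescaped (, ) and | are literal characters. Your pattern therefore matches none of the lines, and !d deletes them all. You may use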

sed '/^\(version\|date\|number\)=/!d' file.txt > newfile.txt

The POSIX BRE pattern here matches:

^ - start of a line
\(version\|date\|number\) - a group matching
- version - a version string
- \| - or
- date - a date string
- \| - or
- number - a number string
= - a = char.
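
The difference between the escaped and unescaped forms can be checked on a tiny sample. This is a minimal sketch assuming GNU sed, which accepts \| as alternation in a basic regular expression; the sample lines and the variable names are made up for illustration.

```shell
# Two sample lines; "other" is an invented key that should be dropped.
sample=$(printf '%s\n' 'version=1.2' 'other=x')

# Unescaped: in a BRE, ( | ) are literal characters, so no line matches
# and !d deletes every line.
bre_unescaped=$(printf '%s\n' "$sample" | sed '/^(version|date|number)=/!d')

# Escaped: \( \| \) act as grouping and alternation (GNU sed assumed).
bre_escaped=$(printf '%s\n' "$sample" | sed '/^\(version\|date\|number\)=/!d')

printf 'unescaped: [%s]\n' "$bre_unescaped"
printf 'escaped:   [%s]\n' "$bre_escaped"
```

With GNU sed, the unescaped form yields no output, while the escaped form keeps the version=1.2 line.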

Or, use a POSIX ERE syntax enabled with -E option:

sed -E '/^(version|date|number)=/!d' file.txt > newfile.txt

Here, the alternation operator | and capturing parentheses do not need escaping.
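
As a quick check, the ERE command can be run against a small throwaway file. This is only a sketch: the file contents and the keys author and comment are invented for illustration, and it assumes a sed that supports -E (GNU and BSD sed both do).

```shell
# Build a throwaway sample file; "author" and "comment" are invented keys.
tmpfile=$(mktemp)
printf '%s\n' 'author=someone' 'version=1.2' 'comment=old' \
    'date=2018-11-09' 'number=42' > "$tmpfile"

# Keep only the lines whose key is version, date or number.
result=$(sed -E '/^(version|date|number)=/!d' "$tmpfile")
printf '%s\n' "$result"

# Clean up the sample file.
rm -f "$tmpfile"
```

The version, date and number lines are printed in their original order; the other two lines are dropped.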

See an online demo.

Thank you for your answer. The different kinds of pattern are the part I was missing. The POSIX ERE syntax is exactly what I can use in my case. — htz, Nov 09 '18 at 13:54

score 1 · Answer 2 · answered Nov 09 '18 at 12:57

Using awk:

awk -F= '$1 ~ /^(version|date|number)$/' file.txt

The field separator is set to =, and a line is printed only if its first field (the key) is exactly version, date or number.
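
To keep only the lines whose key is exactly one of the wanted names, the awk match can be anchored on both ends. The following is a hedged sketch on made-up sample data; the key conversion is invented to show that a key merely containing "version" is not kept.

```shell
# Sample input; "conversion" and "other" are invented keys for this sketch.
sample=$(printf '%s\n' 'version=1.2' 'conversion=yes' 'date=2018-11-09' 'other=x')

# Split on "=" and print a line only when the whole first field
# is version, date or number.
kept=$(printf '%s\n' "$sample" | awk -F= '$1 ~ /^(version|date|number)$/')
printf '%s\n' "$kept"
```

Only the version and date lines are printed. Without the ^ and $ anchors, conversion=yes would be kept as well.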

oliv

