How to delete all lines of a text file that contain ./

Question

I have text files with data that looks like this:

chrom	start	end	gene	mutation
chr1	12756	12790	DVL1	T/C
chr1	12856	12890	DVL2	./.
chr1	12956	12990	DVL3	T/C

I need to delete all the lines that contain ./. in them. The files are around 500 lines, so I don't need anything super efficient.

I've tried a bunch of different approaches with no success, both to cut out the "./." and to cut out the lines that don't contain "./." in the final column.

grep -v "./." input.txt > output.txt

awk '/"./."/' input.txt > tmpfile && mv tmpfile output.txt

fgrep -xv "./." input.txt > output.txt

awk -F',' '$5 !~ "./." {print $0}' input.txt > output.txt

awk 'BEGIN { OFS=FS="\t" } $5 !~ /^('./.')/' input.txt > output.txt

awk '!"./." ' input.txt > output.txt

sed -i '"./."d' input.txt > output.txt

I feel like I'm close but just can't see what I'm missing. Any help is appreciated.

`grep -v '\./\.'`, escape the dots. Or use `grep -vF './.' `. See https://ideone.com/IcjM5Z — Wiktor Stribiżew, Jun 10 '21 at 13:00
@WiktorStribiżew, OP is looking to ignore those lines, dupe given here is finding the lines with dots IMHO. — RavinderSingh13, Jun 10 '21 at 13:02
@RavinderSingh13 The only issue about escaping special char, a dot. Or using the `-F` option. Everything is covered in that post. In general, a dupe of [What special characters must be escaped in regular expressions?](https://stackoverflow.com/questions/399078/what-special-characters-must-be-escaped-in-regular-expressions), but I tried to find a `grep`-oriented post here — Wiktor Stribiżew, Jun 10 '21 at 13:03
@WiktorStribiżew, I agree somewhat few things are covered, but when we have a full fledge answer then why to go for partial dupe. I am fine if we put a exact dupe to make this dupe. — RavinderSingh13, Jun 10 '21 at 13:04
@kellogg76, with `awk`, you can use string comparison instead of regex `'$NF != "./."'` — Sundeep, Jun 10 '21 at 13:05
This `both to cut out the "./." and to cut out the lines that don't contain "./." in the final column` means don't print the line if there is `./.` in it right? Regardless of it being in the last column, and the result should be the 1st and the 3rd line? — The fourth bird, Jun 10 '21 at 13:14
With your artistic rendering of what the input text looks like, we can't tell which of these is correct. `awk -F ','` would work if the input is comma-separated; `awk -F '\t'` would be correct if it's tab-separated. Either of those should have worked if you had used a correct regex, but we can't tell without further details. — tripleee, Jun 10 '21 at 13:27
You showed us all the things you tried, but you didn't explain what didn't work. — Andy Lester, Jun 10 '21 at 13:48
[edit] your question to show concise, testable, plain-text sample input and expected output so we can help you. No images, no links, no artistic tables, just raw text that we can copy/paste as-is to test a potential solution with. — Ed Morton, Jun 10 '21 at 15:08

RavinderSingh13 · Answer 1 · 2021-06-10T13:06:26.403

1

Use grep with the -v option, which prints only the lines that do not match the given pattern:

grep -v '\./\.' Input_file

Or, in awk, try the following:

awk '$NF=="./."{next} 1' Input_file
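As a quick check, here is a minimal sketch that runs both commands on the sample from the question. It assumes the columns are tab-separated, and the file names (sample.txt, grep_out.txt, awk_out.txt) are made up for illustration.

```shell
# Sample data from the question, assumed here to be tab-separated.
printf 'chrom\tstart\tend\tgene\tmutation\n'  > sample.txt
printf 'chr1\t12756\t12790\tDVL1\tT/C\n'     >> sample.txt
printf 'chr1\t12856\t12890\tDVL2\t./.\n'     >> sample.txt
printf 'chr1\t12956\t12990\tDVL3\tT/C\n'     >> sample.txt

# Drop every line that contains a literal "./." anywhere on the line.
grep -v '\./\.' sample.txt > grep_out.txt

# Drop every line whose last whitespace-separated field is exactly "./.".
awk '$NF=="./."{next} 1' sample.txt > awk_out.txt

# Both results should hold the header and the two T/C rows.
cat grep_out.txt
```

Because awk splits on any whitespace by default, the $NF test should behave the same whether the columns are separated by tabs or by spaces.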

edited Jun 10 '21 at 13:06

answered Jun 10 '21 at 13:01


tripleee · Answer 2 · 2021-06-10T13:43:39.927

We can't tell from your input data what your input file looks like. If it's tab-delimited, tell Awk to split on a tab:

awk -F '\t' '$5 != "./."' input.txt >output.txt

If you have a comma-delimited input file, the corresponding command would look like

awk -F ',' '$5 != "./."' input.txt >output.txt

The != string inequality operator is simpler to use than a regular expression here. We are simply saying "print lines where the fifth column is not exactly this string."

The corresponding regex would look like

awk -F ',' '$5 !~ /^\.\/\.$/' input.txt >output.txt

but you would obviously like to avoid the leaning toothpick syndrome (the pile-up of backslashes) here.
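If you do want a regex, one way to cut down on the backslashes is to write the dots as bracket expressions and pass the regex as a string, where the slash needs no escaping. The sketch below assumes a tab-delimited file; the file names are hypothetical.

```shell
# Sample data from the question, assumed to be tab-separated.
printf 'chrom\tstart\tend\tgene\tmutation\n'  > sample.txt
printf 'chr1\t12756\t12790\tDVL1\tT/C\n'     >> sample.txt
printf 'chr1\t12856\t12890\tDVL2\t./.\n'     >> sample.txt
printf 'chr1\t12956\t12990\tDVL3\tT/C\n'     >> sample.txt

# String comparison: keep rows whose fifth field is not exactly "./.".
awk -F '\t' '$5 != "./."' sample.txt > by_string.txt

# Regex literal with backslash escapes (tab-delimited variant of the command above).
awk -F '\t' '$5 !~ /^\.\/\.$/' sample.txt > by_regex.txt

# Same regex given as a string with bracket expressions; no backslashes needed.
awk -F '\t' '$5 !~ "^[.]/[.]$"' sample.txt > by_brackets.txt
```

All three should produce the same output on this sample, so the plain string comparison remains the easiest one to read.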

In more detail, here is what goes wrong in each of your attempts:

grep -v "./." is wrong because it removes any line with a slash with any character at all on either side. You can fix this by escaping the dots with a backslash or character class; grep -v '\./[.]' demonstrates both. This is still wrong in that it looks for the pattern anywhere, not just in the last field; but if you don't expect matches in other fields, maybe that's good enough.
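awk '/"./."/' treats the double quotes as literal characters to match rather than as quoting, and the dots still match any character. On top of that, the unescaped slash in the middle ends the regex early, so Awk will most likely reject the script with a syntax error.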
fgrep -xv "./." is otherwise good, but the -x option restricts matches to lines that contain nothing other than the pattern, so it only removes lines consisting of exactly ./. and leaves your data rows untouched.
awk -F',' '$5 !~ "./." {print $0}' would work for a comma-delimited file if you fix the regex. The { print $0 } is redundant but harmless.
awk 'BEGIN { OFS=FS="\t" } $5 !~ /^('./.')/' holds some promise for tab-delimited files, but the regex is hopelessly botched. The single quotes inside the regex are wrong, but will happily not break the syntax of the script; they will basically disappear before Awk processes the script because quotes are handled by the shell ... Long story short, read up on shell quoting.
awk '!"./." ' will not print anything; the condition negates a constant string, which is true only if that string is empty, and "./." is not empty.
sed -i '"./."d' input.txt > output.txt is wrong because the -i option will make changes to input.txt and not print anything to standard output; the regex is flawed both because of the quoting problems and because it needs to be surrounded by valid regex delimiters. sed '/\.\/\./d' input.txt > output.txt would work similarly to the first grep example above.
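To make the differences concrete, here is a small sketch that runs a few of these variants on the sample from the question. It assumes tab-separated input, uses made-up file names, and uses grep -F as a stand-in for fgrep (the two are equivalent).

```shell
# Sample data from the question, assumed to be tab-separated.
printf 'chrom\tstart\tend\tgene\tmutation\n'  > sample.txt
printf 'chr1\t12756\t12790\tDVL1\tT/C\n'     >> sample.txt
printf 'chr1\t12856\t12890\tDVL2\t./.\n'     >> sample.txt
printf 'chr1\t12956\t12990\tDVL3\tT/C\n'     >> sample.txt

# Unescaped dots match any character, so "T/C" matches too and is removed.
grep -v "./." sample.txt > unescaped.txt      # only the header survives

# -F treats the pattern as a fixed string, so only the literal ./. row goes.
grep -vF './.' sample.txt > fixed.txt

# -x requires the whole line to equal the pattern, so nothing is removed here.
grep -Fxv './.' sample.txt > whole_line.txt

# sed with proper delimiters and escapes; writes to stdout, input left untouched.
sed '/\.\/\./d' sample.txt > sed_out.txt

# Compare line counts across the four results.
wc -l unescaped.txt fixed.txt whole_line.txt sed_out.txt
```

Note that the grep and sed variants look for ./. anywhere on the line; if other columns could contain it, the field-based awk commands above are the safer choice.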
