I'm trying to find a ~500-character DNA sequence in a CSV file, a few megabytes in size, that contains many different sequences. Before each sequence in the CSV file there is some metadata that I would also like to retrieve. Each sequence, and each sequence's metadata, takes up exactly one line. I've tried:
grep -B 1 "extremelylongstringofDNATACGGCATAGAGGCCGAGACCTAGGATTAACGTTACTGACGAT" csvfile.csv
However, that returns "filename too long".
Something interesting and frustrating happened when I tried to get the line count of the CSV file using:
wc -l csvfile.csv
It returned:
0 csvfile.csv
And without the -l flag, it returned:
0 161410 41507206 csvfile.csv
I get the same result even after adding a line between the end of each sequence and the start of the next sequence's metadata.
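Since wc -l counts newline (LF) characters, a count of 0 next to 161410 words suggests the file contains no LF bytes at all. Here is a minimal sketch for checking which line-terminator bytes are actually present. The helper name count_byte is made up for illustration, and the idea that the file might use CR-only line endings is an assumption to be tested, not something I have confirmed.

```shell
# count_byte is a hypothetical helper, not a standard command.
# It deletes every byte except the one given, then counts what is left.
# $1: a tr escape such as '\r' (carriage return) or '\n' (line feed)
# $2: path to the file to inspect
count_byte() {
    tr -cd "$1" < "$2" | wc -c
}

# csvfile.csv is the file name used above; skip quietly if it is absent.
file="csvfile.csv"
if [ -f "$file" ]; then
    echo "CR bytes: $(count_byte '\r' "$file")"
    echo "LF bytes: $(count_byte '\n' "$file")"
fi
```

If the CR count is large and the LF count is 0, the whole file would look like a single line to tools such as grep and wc.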