I am trying to find every place where my data has a repeated line and delete the repeat. I also want to find each line where the 2nd column has the value 90 and replace the 2nd column of the following line with a number I designate.
My data looks like this:
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
7 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 31 0 0 0.0000 70221
I want my data to look like:
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 5 0 0 0.0000 70221
My code:
BEGIN {
    priorline = "";
    ERROROFFSET = 50;
    ERRORVALUE[10] = 1;
    ERRORVALUE[11] = 2;
    ERRORVALUE[12] = 3;
    ERRORVALUE[30] = 4;
    ERRORVALUE[31] = 5;
    ERRORVALUE[32] = 6;
    ORS = "\n";
}
NR == 1 {
    print;
    getline;
    priorline = $0;
}
NF == 6 {
    brandnewline = $0
    mytype = $2
    $0 = priorline
    priorField2 = $2;
    if (mytype !~ priorField2) {
        print;
        priorline = brandnewline;
    }
    if (priorField2 == "90") {
        mytype = ERRORVALUE[mytype];
    }
}
END {print brandnewline}
## For each data line, brandnewline holds the current line and priorline holds
## the last line that was kept (e.g. once line 1 has been handled, priorline is
## line 1 and brandnewline moves on to line 2). Column 2 is tracked the same
## way: mytype is column 2 of the current line and priorField2 is column 2 of
## priorline. The first if statement checks whether column 2 of the current
## line !~ (does not match, as a regular expression) column 2 of the previous
## line; if so, the previous line is printed and the current line becomes the
## new priorline, otherwise the current line is skipped. The second if statement
## is meant to recognize a line that follows a 90 and replace its column 2 value
## with the ERRORVALUE defined for that type (10=1, 11=2, 12=3, 30=4, 31=5, 32=6).
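One detail worth checking in the comparison above: `!~` is awk's regex non-match operator, while `!=` tests inequality. The two can disagree when one value appears inside the other. The sketch below is only an illustration; `compare_ops` is a hypothetical helper and the values 31 and 1 are made up, not taken from the data.

```sh
# compare_ops is a hypothetical helper, not part of the original script.
compare_ops() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        # a !~ b is true only when the regex b matches nowhere inside a
        print ((a !~ b) ? "regex: no match" : "regex: match")
        # a != b is true when the two values differ
        print ((a != b) ? "equality: different" : "equality: same")
    }'
}

compare_ops 31 1
# prints:
#   regex: match
#   equality: different
```

With two-digit types like the ones in the sample data the two operators usually agree, but `!=` (or `==`) states the intent more directly.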
I have been able to delete the repeated lines successfully. However, I cannot get the second part to work: when a line follows a line whose type is 90, I want to replace its column 2 value with the corresponding ERRORVALUE defined in BEGIN (10=1, 11=2, 12=3, 30=4, 31=5, 32=6). Essentially, I just want to replace that value in the line with my ERRORVALUE.
If anyone can help me with this I would be very grateful.
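One likely reason the replacement never shows up is that `mytype = ERRORVALUE[mytype]` only changes the variable `mytype`; the line is changed only when the field itself is assigned (`$2 = ...`). The following is a minimal single-pass sketch of both steps, not a definitive fix. It assumes that a "repeat" means a line whose Type equals the Type of the previous kept line (as in the script above), that the comparison uses the original Type even after a replacement, and that lines without exactly 6 fields can be dropped. `clean_events`, `type`, and `prevtype` are names introduced here for illustration.

```sh
# clean_events is a hypothetical wrapper; pass the data file as an argument
# or feed the data on standard input.
clean_events() {
    awk '
    BEGIN {
        # Replacement codes copied from the ERRORVALUE table in the question.
        ERRORVALUE[10] = 1; ERRORVALUE[11] = 2; ERRORVALUE[12] = 3
        ERRORVALUE[30] = 4; ERRORVALUE[31] = 5; ERRORVALUE[32] = 6
        prevtype = ""   # Type of the last kept line; empty before the first one
    }
    NR == 1 { print; next }           # header line passes through unchanged
    NF == 6 {
        type = $2                     # remember the original Type of this line
        if (type == prevtype) next    # same Type as the last kept line: drop it
        # The last kept line was a 90: swap in the designated replacement value.
        if (prevtype == 90 && (type in ERRORVALUE)) $2 = ERRORVALUE[type]
        print
        prevtype = type               # track the original Type, not the replacement
    }
    ' "$@"
}
```

It could be run as `clean_events data.txt > cleaned.txt`, where `data.txt` is a placeholder file name. Assigning to `$2` makes awk rebuild the line with single spaces between fields, so any modified line loses its original column spacing.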