Let's say I have two input data files. The first looks like this:
1 0.00038 0.75053 0.50 35 6000 0.75346
2 0.00038 0.75053 0.50 35 6050 0.72079
3 0.00038 0.75053 0.50 35 6100 0.69229
4 0.00038 0.75053 0.50 35 6150 0.66689
5 0.00038 0.75053 0.50 35 6200 0.64382
6 0.00038 0.75053 0.50 35 6250 0.62269
7 0.00038 0.75053 0.50 35 6300 0.60313
8 0.00038 0.75053 0.50 35 6350 0.58481
9 0.00038 0.75053 0.50 35 6400 0.56756
10 0.00038 0.75053 0.50 35 6450 0.55122
And the second one looks like this:
1 -0.123 -0.306 inf 1.043 0.000 0.010 0.000 0.653 0.000 0.091 0.000 0.009 0.000 3.097 0.000 0.137 0.002
2 -0.142 -0.170 inf 1.035 0.000 0.064 0.000 0.538 0.000 0.560 0.000 0.289 0.000 3.168 0.000 6.182 0.000
3 -0.160 -0.143 inf 1.027 0.000 0.086 0.000 0.401 0.000 0.631 0.000 0.400 0.000 3.348 0.000 0.130 0.000
4 -0.176 -0.117 inf 1.020 0.000 0.107 0.000 0.249 0.000 0.592 0.000 0.435 0.000 3.526 0.000 0.402 0.001
5 -0.191 -0.110 inf 1.014 0.000 0.133 0.000 0.091 0.000 0.514 0.000 0.425 0.000 3.644 0.001 0.598 0.001
6 -0.206 -0.099 inf 1.008 0.000 0.162 0.000 6.247 0.000 0.435 0.001 0.392 0.001 3.675 0.001 0.707 0.002
7 -0.220 -0.093 0.976 1.003 0.000 0.194 0.000 6.168 0.001 0.377 0.001 0.352 0.001 3.602 0.003 0.740 0.003
8 -0.233 -0.092 inf 0.999 0.000 0.226 0.000 6.137 0.001 0.353 0.001 0.302 0.001 3.445 0.004 0.712 0.005
9 -0.246 -0.124 inf 0.996 0.000 0.258 0.000 6.145 0.001 0.363 0.001 0.252 0.001 3.242 0.004 0.620 0.006
10 -0.259 -0.119 inf 0.994 0.000 0.289 0.000 6.172 0.001 0.393 0.001 0.206 0.001 3.028 0.005 0.456 0.008
Now, as you can see, there appears to be 2 populations
in this graph, no? I would like to find out which rows in the 2nd file correlate to the different populations. What would be the best way to do this?
If you would like to reproduce this yourself here is the first input file and the second input file.