Hi Stack Overflow community!
I have the following data set:
0 A 0.000027769231 1 B 0.000030287440 0.628306 0.988151 1
0 A 0.000027479497 2 C 0.000035937793 0.581428 0.976041 1
1 B 0.000030287440 2 C 0.000035532483 0.516033 0.987388 1
4 D 0.000011085990 5 E 0.000008163211 0.577556 0.943583 1
4 D 0.000010787916 8 F 0.000008873166 0.531686 0.954017 1
5 E 0.000007865264 8 F 0.000008873166 0.691516 0.989945 1
311 G 0.000006216949 312 H 0.000002510852 0.829361 0.983148 1
326 M 0.000028129783 327 N 0.000011022112 0.843188 0.915627 1
326 M 0.000027462953 328 O 0.000002167529 1.742349 0.943267 1
326 M 0.000028024026 329 P 0.000005130416 1.263187 0.924010 1
326 M 0.000027630314 330 R 0.000002965539 1.668906 0.935518 1
326 M 0.000027721668 331 S 0.000002614498 1.851544 0.939051 1
326 M 0.000028129332 332 T 0.000003145471 1.742525 0.930186 1
327 N 0.000011020065 328 O 0.000002570277 2.473902 0.943474 1
327 N 0.000011028065 329 P 0.000005235456 1.447848 0.976569 1
327 N 0.000011032158 330 R 0.000003154471 2.303768 0.955479 1
327 N 0.000011025788 331 S 0.000002864823 2.038783 0.946972 1
327 N 0.000011064135 332 T 0.000003183160 1.213611 0.975056 1
328 O 0.000002505234 329 P 0.000005129224 1.549313 0.968629 1
328 O 0.000002452331 330 R 0.000002965465 2.328536 0.981076 1
329 P 0.000005147180 330 R 0.000003095314 2.803627 0.977268 1
329 P 0.000005208069 332 T 0.000003147536 2.658807 0.984912 1
330 R 0.000002967887 331 S 0.000002700052 1.208673 0.987825 1
330 R 0.000003110114 332 T 0.000003145140 2.428988 0.983747 1
331 S 0.000002853757 332 T 0.000003145464 1.551457 0.982276 1
366 I 0.000000326315 367 J 0.000000253986 1.410176 0.961879 1
366 I 0.000000327483 368 K 0.000000110327 1.236265 0.918510 1
366 I 0.000000326939 369 Q 0.000000165208 2.258098 0.907039 1
367 J 0.000000257330 368 K 0.000000113511 2.600934 0.907874 1
367 J 0.000000256872 369 Q 0.000000166861 1.102368 0.937099 1
Each row contains a unique pair of elements, which I have labelled here with letters (columns 2 and 5). I want to group the elements that are connected through these pairs and pick the largest value from column 3 or 6 in each group. For this dataset I should get 5 groups, each listing its elements and the maximum value from column 3 or 6:
A
B
C
maxval: C: 0.000035937793
D
E
F
maxval: D: 0.000011085990
G
H
maxval: G: 0.000006216949
M
N
O
P
R
S
T
maxval: M: 0.000028129783
I
J
K
Q
maxval: I: 0.000000326939
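One way to think about the grouping above is to treat every row as a link between two elements and collect the connected sets. Below is a minimal sketch of that idea, not a definitive solution. It assumes whitespace-separated rows in the layout shown, that a letter identifies exactly one element, and it uses a small union-find structure, which is my choice of technique. The names `group_pairs` and `sample` are made up for illustration.

```python
from collections import defaultdict


def group_pairs(lines):
    """Return a list of (members, top_element, top_value), one per group.

    Assumed row layout: id1 name1 value1 id2 name2 value2 ... (rest ignored).
    """
    parent = {}  # union-find forest keyed by element name
    best = {}    # largest value seen so far for each element
    ids = {}     # column 1 or 4 of each element

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for line in lines:
        f = line.split()
        if len(f) < 6:
            continue  # skip blank or malformed rows
        for ident, name, val in ((f[0], f[1], f[2]), (f[3], f[4], f[5])):
            ids[name] = int(ident)
            best[name] = max(best.get(name, float("-inf")), float(val))
        parent[find(f[1])] = find(f[4])  # merge the two elements' groups

    groups = defaultdict(list)
    for name in parent:
        groups[find(name)].append(name)

    result = []
    for members in groups.values():
        members.sort(key=ids.get)
        top = max(members, key=best.get)
        result.append((members, top, best[top]))
    result.sort(key=lambda g: ids[g[0][0]])  # order groups by smallest id
    return result


# A few rows copied from the data above.
sample = [
    "0 A 0.000027769231 1 B 0.000030287440 0.628306 0.988151 1",
    "0 A 0.000027479497 2 C 0.000035937793 0.581428 0.976041 1",
    "1 B 0.000030287440 2 C 0.000035532483 0.516033 0.987388 1",
    "311 G 0.000006216949 312 H 0.000002510852 0.829361 0.983148 1",
]

for members, top, value in group_pairs(sample):
    print(*members, sep="\n")
    print(f"maxval: {top}: {value:.12f}")
```

Because the per-row values of one element differ slightly, this sketch keeps the largest value seen for each element, so a reported maxval can differ in the last digits from a value taken from one particular row.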
As you can see, when the same element (e.g. A) appears in more than one row, its values in column 3 differ slightly. However, we can assume that A has the same column 3 value in every case.
As output I want three files:
- a list of groups with the maxval from column 3 or 6
- a list of the elements with the largest value from column 3 or 6 in each group. I also want to include column 1 or 4 for every element:
2 C
4 D
311 G
326 M
366 I
- a list of the remaining elements from every group:
0 A
1 B
5 E
8 F
312 H
327 N
328 O
329 P
330 R
331 S
332 T
367 J
368 K
369 Q
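The three files described above could be produced roughly as follows. This is only a sketch under stated assumptions: the groups are already known and passed in as lists of (id, name, value) tuples with one entry per element, and the file names `groups.txt`, `max_elements.txt` and `other_elements.txt` as well as the function name `write_outputs` are placeholders I made up.

```python
import tempfile
from pathlib import Path


def write_outputs(groups, out_dir="."):
    """Write the three lists; return the lines of the second and third file.

    `groups` is assumed to be a list of groups, each a list of
    (id, name, value) tuples. File names are hypothetical.
    """
    out = Path(out_dir)
    group_lines, top_lines, other_lines = [], [], []
    for members in groups:
        top = max(members, key=lambda m: m[2])  # element with the largest value
        group_lines += [name for _, name, _ in members]
        group_lines.append(f"maxval: {top[1]}: {top[2]:.12f}")
        top_lines.append(f"{top[0]} {top[1]}")
        other_lines += [f"{i} {name}" for i, name, _ in members if name != top[1]]
    (out / "groups.txt").write_text("\n".join(group_lines) + "\n")
    (out / "max_elements.txt").write_text("\n".join(top_lines) + "\n")
    (out / "other_elements.txt").write_text("\n".join(other_lines) + "\n")
    return top_lines, other_lines


# Two groups built by hand from values in the data above.
sample_groups = [
    [(0, "A", 0.000027769231), (1, "B", 0.000030287440), (2, "C", 0.000035937793)],
    [(311, "G", 0.000006216949), (312, "H", 0.000002510852)],
]

# Write into a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    tops, others = write_outputs(sample_groups, tmp)
    print((Path(tmp) / "groups.txt").read_text(), end="")
    print(tops)
    print(others)
```

Returning the lines as well as writing them makes it easy to check the result without reopening the files.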
I have no idea how to do this in Python. Can anyone help me with some advice or code snippets?