I have a coding problem beyond my limited skills with Unix power tools. For each gene, I'm looking to count the number of samples with either: i) a homozygous variant in that gene (BB below); or ii) two variants in that gene (2x AB). For example, from:
Variant Gene Sample1 Sample2 Sample3
1 TP53 AA BB AB
2 TP53 AB AA AB
3 TP53 AB AA AA
4 KRAS AA AB AA
5 KRAS AB AB BB
I'm looking for:
Gene Two_variants Homozygous Either
TP53 2 1 3
KRAS 1 1 2
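One possible way to sketch this counting in Python rather than with shell tools is shown here. It rests on several assumptions that the example does not settle: the input is whitespace-delimited with the first two columns being Variant and Gene, "two variants" means at least two AB calls in the same gene for the same sample, a single BB call is enough to count as homozygous, and a sample that meets both criteria is counted once under Either. The names `count_by_gene` and `EXAMPLE` are made up for illustration.

```python
# Example input copied from the question (whitespace-delimited).
EXAMPLE = """\
Variant Gene Sample1 Sample2 Sample3
1 TP53 AA BB AB
2 TP53 AB AA AB
3 TP53 AB AA AA
4 KRAS AA AB AA
5 KRAS AB AB BB
"""


def count_by_gene(lines):
    """Return {gene: (two_variants, homozygous, either)}, genes in input order."""
    rows = [line.split() for line in lines if line.strip()]
    header, records = rows[0], rows[1:]
    n_samples = len(header) - 2  # assumes columns 1-2 are Variant and Gene

    # gene -> (AB calls per sample, BB calls per sample)
    counts = {}
    for _variant, gene, *calls in records:
        het, hom = counts.setdefault(gene, ([0] * n_samples, [0] * n_samples))
        for i, call in enumerate(calls):
            if call == "AB":
                het[i] += 1
            elif call == "BB":
                hom[i] += 1

    result = {}
    for gene, (het, hom) in counts.items():
        two_variants = sum(h >= 2 for h in het)   # assumption: two or more AB calls
        homozygous = sum(b >= 1 for b in hom)     # assumption: one or more BB calls
        either = sum(h >= 2 or b >= 1 for h, b in zip(het, hom))
        result[gene] = (two_variants, homozygous, either)
    return result


print("Gene Two_variants Homozygous Either")
for gene, (two_variants, homozygous, either) in count_by_gene(EXAMPLE.splitlines()).items():
    print(gene, two_variants, homozygous, either)
```

To run it on a real file, the lines of an open file handle could be passed in place of `EXAMPLE.splitlines()`. If a sample with one BB and one AB in the same gene should also count as having two variants, the `two_variants` condition would need to change accordingly.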
Any help would be much appreciated. Thanks.
R_G