I have two tables; the start of each is given below:
Table 1: All SNPs
SNP Gene
rs1798922 ENSG00000167634
rs4677723 ENSG00000167634
rs1609823 ENSG00000104450
rs11597390 ENSG00000104643
rs7824557 ENSG00000104643
rs1371867 ENSG00000104450
Table 2: Best SNP per gene
SNP Gene
rs1371867 ENSG00000104450
rs7824557 ENSG00000104643
rs1671152 ENSG00000167634
rs11597390 ENSG00000095485
rs285757 ENSG00000185442
Table 1 shows a list of genes with their corresponding SNPs. As can be seen, the same gene is repeated in many places in the table.
Table 2 is the result of filtering the SNPs for each gene in Table 1 so that only one SNP per gene is kept (the best SNP according to the p-value, although that's not relevant here).
In other words, some SNPs in Table 1 are not included in Table 2, since Table 2 keeps only the best SNP for each gene.
For each gene, I want to use R to compare the two tables and output the SNPs that weren't included in Table 2 for that gene. The comparison key is therefore the gene name, which differs from row to row since there are many genes in the table.
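To make the goal concrete, here is a rough sketch of the kind of comparison I have in mind. It assumes the two tables are loaded as data frames with columns SNP and Gene; the names table1, table2, dropped and dropped_by_gene are placeholders I made up, and the data are just the sample rows shown above. It treats a row of Table 1 as "included" only if the same SNP/Gene pair appears in Table 2, which may or may not be the right matching rule.

```r
# Placeholder data frames built from the sample rows shown above
table1 <- data.frame(
  SNP  = c("rs1798922", "rs4677723", "rs1609823",
           "rs11597390", "rs7824557", "rs1371867"),
  Gene = c("ENSG00000167634", "ENSG00000167634", "ENSG00000104450",
           "ENSG00000104643", "ENSG00000104643", "ENSG00000104450"),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  SNP  = c("rs1371867", "rs7824557", "rs1671152",
           "rs11597390", "rs285757"),
  Gene = c("ENSG00000104450", "ENSG00000104643", "ENSG00000167634",
           "ENSG00000095485", "ENSG00000185442"),
  stringsAsFactors = FALSE
)

# Combine SNP and Gene into one key so a SNP only counts as "kept"
# when it is listed in Table 2 for the same gene
key1 <- paste(table1$SNP, table1$Gene)
key2 <- paste(table2$SNP, table2$Gene)

# Rows of Table 1 whose SNP/Gene pair does not appear in Table 2
dropped <- table1[!(key1 %in% key2), ]

# Group the dropped SNPs by gene: a named list, one element per gene
dropped_by_gene <- split(dropped$SNP, dropped$Gene)
print(dropped_by_gene)
```

With the sample rows, this would list rs1798922 and rs4677723 under ENSG00000167634, for example. On the full tables, a join-based approach (such as an anti-join on both columns) might be cleaner, but I have not tried that.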