I have around 600 .txt files in this format:
Position SRR7622449
chr1_944296 1
chr1_944307 1
chr1_946247 1
chr1_1014274 1
chr1_1401954 1
chr1_1541864 1
Each file has two columns: a Position column, which exists in every file, and a second column whose header is the sample identifier and differs from file to file.
The number of rows in every file is different.
I want to merge all 600+ of these files into one dataframe and sum the values of duplicated Positions, so that in the end each Position appears in exactly one row.
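To make the goal concrete, here is a toy version with two made-up files (the values and the duplicate are invented to mirror my real data):

library(tibble)
# Hypothetical contents of two files; each has Position plus one
# sample-specific column, and a Position can repeat within a file.
file1 <- tibble(Position = c("chr1_944296", "chr1_944296", "chr1_944307"),
                SRR7622449 = c(1, 1, 1))
file2 <- tibble(Position = "chr1_944296",
                SRR7622450 = 1)
# After merging, chr1_944296 should be a single row with
# SRR7622449 = 2 and SRR7622450 = 1.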
This is what I tried first:
require(readr)
require(dplyr)
require(tidyr)
require(purrr)  # map() comes from purrr and was missing above

files <- dir(pattern = "\\.txt$")  # anchor the pattern so only .txt files match
data <- files %>% map(read_tsv) %>% bind_rows()
This gave me a huge dataframe in which the same Position appears on several rows: since no two files share a value column, bind_rows() just stacks the rows and fills the missing columns with NA instead of merging them. I want this result:
Position SRR7622449 SRR7622450
chr1_944296 2 1
instead of
Position SRR7622449 SRR7622450
chr1_944296 1 NA
chr1_944296 1 1
When I try
data %>% group_by(Position) %>% summarise_each(funs(max))
I seem to be losing values; I suspect max() returns NA for any group that contains an NA, since na.rm = TRUE is not set. What do I do to fix this?
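I suspect the right aggregation is a sum per Position rather than a max. This is a sketch of what I mean, using the newer across() syntax in place of the deprecated summarise_each()/funs(); I haven't verified it on the full data:

library(dplyr)
# Collapse duplicated Positions by summing every sample column.
# na.rm = TRUE turns the NAs introduced by bind_rows() into 0
# rather than letting them propagate through sum().
merged <- data %>%
  group_by(Position) %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))

If turning NA into 0 is not acceptable, is there a cleaner way to combine the files in the first place, e.g. joining on Position instead of stacking the rows?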