I think what I'm trying to do is replicate the cat functionality of the Linux shell in a platform-agnostic way, so that I can take two text files and merge their contents in the following manner:
file_1 contains:
42 bottles of beer on the wall
file_2 contains:
Beer is clearly the answer
Merged file should contain:
42 bottles of beer on the wall
Beer is clearly the answer
Most of the techniques I've read about, however, end up producing:
42 bottles of beer on the wallBeer is clearly the answer
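For illustration, one way this run-together output can arise is when file_1 does not end with a newline, so a byte-for-byte join leaves nothing between the two lines. The snippet below is only a sketch under that assumption; it uses in-memory bytes in place of the real files, and the variable names are made up for the example.

```python
# Assumption: file_1's content has no trailing newline.
first = b"42 bottles of beer on the wall"
second = b"Beer is clearly the answer\n"

# Joining the raw contents directly leaves nothing between the two lines.
naive = first + second

# Adding a separator only when the first part lacks one keeps the lines apart.
separator = b"" if first.endswith(b"\n") else b"\n"
fixed = first + separator + second

print(repr(naive))
print(repr(fixed))
```

If file_1 already ended with a newline, the separator would be empty and no blank line would be introduced.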
Another issue is that the actual files I'd like to work with are very large text files (FASTA-formatted protein sequence files), so I think most methods that read line by line would be inefficient. Hence, I have been trying to figure out a solution using shutil, as below:
def concatenate_fasta(file1, file2, newfile):
    destination = open(newfile,'wb')
    shutil.copyfileobj(open(file1,'rb'), destination)
    destination.write('\n...\n')
    shutil.copyfileobj(open(file2,'rb'), destination)
    destination.close()
However, this produces the same problem as before, except with "..." in between. Clearly, the newlines are being ignored, but I'm at a loss as to how to manage this properly.
Any help would be most appreciated.
EDIT:
I tried Martijn's suggestion, but the line_sep value returned is None, which throws an error when the function attempts to write it to the output file. I have gotten this working now via the os.linesep method, which was mentioned as less optimal, as follows:
with open(newfile,'wb') as destination:
    with open(file_1,'rb') as source:
        shutil.copyfileobj(source, destination)
    destination.write(os.linesep*2)
    with open(file_2,'rb') as source:
        shutil.copyfileobj(source, destination)
    # no explicit destination.close() needed: the with block closes the file
This gives me the functionality I need, but I'm still at a bit of a loss as to why the (seemingly more elegant) solution is failing.
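For reference, here is a self-contained sketch of the same shutil-based approach that writes a separator only when the first file does not already end with a newline. It is a rough illustration rather than a definitive implementation: the function name concatenate_files and the separator parameter are hypothetical, and it assumes the files use "\n" or "\r\n" line endings so that checking the last byte is enough.

```python
import os
import shutil
import tempfile


def concatenate_files(file1, file2, newfile, separator=b"\n"):
    """Copy file1 then file2 into newfile, keeping them on separate lines."""
    with open(newfile, "wb") as destination:
        with open(file1, "rb") as source:
            # Chunked binary copy, so the whole file is never held in memory.
            shutil.copyfileobj(source, destination)
            # After the copy the position is at the end, i.e. the file size.
            size = source.tell()
            needs_separator = False
            if size > 0:
                # Inspect the last byte to see whether file1 already ends a line.
                source.seek(-1, os.SEEK_END)
                needs_separator = source.read(1) != b"\n"
        if needs_separator:
            destination.write(separator)
        with open(file2, "rb") as source:
            shutil.copyfileobj(source, destination)


if __name__ == "__main__":
    # Small demonstration with throwaway files standing in for the FASTA inputs.
    workdir = tempfile.mkdtemp()
    first = os.path.join(workdir, "file_1.txt")
    second = os.path.join(workdir, "file_2.txt")
    merged = os.path.join(workdir, "merged.txt")
    with open(first, "wb") as handle:
        handle.write(b"42 bottles of beer on the wall")
    with open(second, "wb") as handle:
        handle.write(b"Beer is clearly the answer\n")
    concatenate_files(first, second, merged)
    with open(merged, "rb") as handle:
        print(repr(handle.read()))
```

Because everything is opened in binary mode, the separator is written exactly as given, with no newline translation. Whether b"\n" or the platform's native line ending is the better choice probably depends on which tools will read the merged file afterwards.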