1

I'm trying to merge multiple .txt files that are in a .zip file into only one .txt file in Python.

My code is the following:

import shutil
from pathlib import Path

firstfile = Path(r'C:\Users\Viniz\Downloads\devkmbe-5511001_05-12-2022_00_20_09.zip\AudioCaptureMemoryUsage_01_12_2022.txt')
secondfile = Path(r'C:\Users\Viniz\Downloads\devkmbe-5511001_05-12-2022_00_20_09.zip\AudioMatchingMemoryUsage_01_12_2022.txt')

newfile = input("Enter the name of the new file: ")
print()
print("The merged content of the 2 files will be in", newfile)

with open(newfile, "wb") as wfd:
    for f in [firstfile, secondfile]:
        with open(f, "rb") as fd:
            shutil.copyfileobj(fd, wfd, 1024 * 1024 * 10)

print("\nThe content is merged successfully!")
print("Do you want to view it? (y / n): ")

check = input()
if check == 'n':
    exit()
else:
    print()
    c = open(newfile, "r")
    print(c.read())
    c.close()



Thanks.

I tried to merge them into a single file, but it didn't work.

ViniPonce

2 Answers

1

To merge the files, you'll need to first extract the files from the zip file, then merge them, and then write the merged content to a new file. Here is an example of how you can do this using the zipfile module.

Update: If the .txt files are located inside a folder within the zip file, you'll need to include the folder name in the path when opening the files.

import zipfile

zip_file = r'C:\Users\Viniz\Downloads\devkmbe-5511001_05-12-2022_00_20_09.zip'
folder_name = 'myfolder'
first_file = folder_name + '/AudioCaptureMemoryUsage_01_12_2022.txt'
second_file = folder_name + '/AudioMatchingMemoryUsage_01_12_2022.txt'

with zipfile.ZipFile(zip_file, 'r') as zip_ref:
    with zip_ref.open(first_file) as f1, zip_ref.open(second_file) as f2:
        first_content = f1.read()
        second_content = f2.read()

    # Concatenate the two files
    merged_content = first_content + second_content
    
    # Write the merged content to a new file
    new_file = input("Enter the name of the new file: ")
    with open(new_file, 'wb') as new_f:
        new_f.write(merged_content)
        
    print("The content is merged successfully!")
    print("Do you want to view it? (y / n): ")

    check = input()
    if check == 'n':
        exit()
    else:
        print()
        c = open(new_file, "r")
        print(c.read())
        c.close()

Make sure to replace 'myfolder' with the actual name of the folder containing the .txt files in your zip file.
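
If it is not obvious what the folder inside the archive is called, the member names can be listed first. The following is a minimal sketch that assumes only the standard zipfile module; the helper name list_txt_members is hypothetical and not part of the code above.

```python
import zipfile

def list_txt_members(zip_path):
    """Return the in-archive paths of all .txt members (hypothetical helper)."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        # namelist() returns the stored names, including any folder prefix,
        # with forward slashes as separators
        return [name for name in zip_ref.namelist()
                if name.lower().endswith(".txt")]
```

Calling print(list_txt_members(zip_file)) with the zip_file path from the example should show entries of the form 'foldername/filename.txt'; the part before the slash is the value to use for folder_name.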

For multiple files:

import zipfile

zip_file = r'C:\Users\Viniz\Downloads\devkmbe-5511001_05-12-2022_00_20_09.zip'
folder_name = 'myfolder'
file_names = ['AudioCaptureMemoryUsage_01_12_2022.txt',
              'AudioMatchingMemoryUsage_01_12_2022.txt',
              'File3.txt',
              'File4.txt',
              # ... add the remaining file names here ...
              'File29.txt']

merged_content = b''  # Initialize an empty bytes object
with zipfile.ZipFile(zip_file, 'r') as zip_ref:
    for file_name in file_names:
        with zip_ref.open(folder_name + '/' + file_name) as f:
            merged_content += f.read()
            
    # Write the merged content to a new file
    new_file = input("Enter the name of the new file: ")
    with open(new_file, 'wb') as new_f:
        new_f.write(merged_content)
        
    print("The content is merged successfully!")
    print("Do you want to view it? (y / n): ")

    check = input()
    if check == 'n':
        exit()
    else:
        print()
        c = open(new_file, "r")
        print(c.read())
        c.close()
Gihan
  • When trying to use your code I get this error: Traceback (most recent call last): File "C:\Users\Viniz\PycharmProjects\ProjetoPY\zipprojeto.py", line 76, in with zip_ref.open(first_file) as f1, zip_ref.open(second_file) as f2: ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Viniz\AppData\Local\Programs\Python\Python311\Lib\zipfile.py", line 1544, in open zinfo = self.getinfo(name) ^^^^^^^^^^^^^^^^^^ File "C:\Users\Viniz\AppData\Local\Programs\Python\Python311\Lib\zipfile.py", line 1473, in getinfo raise KeyError( – ViniPonce Jan 26 '23 at 14:43
  • plus: KeyError: "There is no item named 'AudioCaptureMemoryUsage_01_12_2022.txt' in the archive" – ViniPonce Jan 26 '23 at 14:45
  • Make sure that the path to the zip file is correct and also the file names that you are trying to extract are correct. – Gihan Jan 26 '23 at 14:48
  • You can check what files are in the archive using the ZipFile.namelist() method, like this: with zipfile.ZipFile(zip_file, 'r') as zip_ref: print(zip_ref.namelist()) – Gihan Jan 26 '23 at 14:49
  • Hmm, I forgot to mention that when you open the zip, there is a folder before the .txt files. What else should I do regarding that? – ViniPonce Jan 26 '23 at 14:52
  • Try now I've updated the code. – Gihan Jan 26 '23 at 15:00
  • Many thanks bro, you saved my job! Do you know the best way to do that with 29 more logs? Do I create third_file (...) 29_file with the names? – ViniPonce Jan 26 '23 at 15:02
  • Yes, you can create variables for each of the files and add them to a list, then use a loop to iterate through the list and extract and merge the contents of the files. – Gihan Jan 26 '23 at 15:06
  • Hmm, got it. Also, I have the same log being generated on different days (the name varies according to the day the log was extracted). Is there any way to make it work? – ViniPonce Jan 26 '23 at 16:20
  • Bro, how do I make this dynamic? I have logs from a lot of days, and which ones are included depends on when you pick those logs, so how do I make this dynamic? – ViniPonce Jan 27 '23 at 16:25
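
One possible way to make this dynamic, as asked in the last comments, is to stop hard-coding the file names and merge whatever .txt members the archive contains. This is only a sketch under assumptions: it assumes every .txt file in the zip should be merged, it orders them by name (which is not chronological for dd_mm_yyyy dates), and the helper name merge_txt_from_zip is hypothetical.

```python
import shutil
import zipfile

def merge_txt_from_zip(zip_path, out_path):
    """Append every .txt member of the archive to out_path; return the merged names."""
    merged = []
    with zipfile.ZipFile(zip_path, "r") as zip_ref, open(out_path, "wb") as out:
        # Assumption: alphabetical order of the in-archive names is acceptable
        for name in sorted(zip_ref.namelist()):
            if name.lower().endswith(".txt"):
                with zip_ref.open(name) as member:
                    # Stream the member in chunks instead of reading it all at once
                    shutil.copyfileobj(member, out)
                merged.append(name)
    return merged
```

Because the names are discovered with namelist(), the same call should work regardless of which days the logs in a given zip cover, and the returned list shows what was actually merged.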
0
import os
import zipfile
import shutil

def extract_txt_files(zip_path, temp_folder):
    """Extracts all the .txt files from the given zip file to the given temp folder"""
    with zipfile.ZipFile(zip_path, "r") as zip_file:
        i = len([name for name in os.listdir(temp_folder) if name.endswith(".txt")]) + 1
        for member in zip_file.infolist():
            if member.filename.endswith(".txt"):
                zip_file.extract(member, temp_folder)
                os.rename(os.path.join(temp_folder, member.filename), os.path.join(temp_folder, f"{i}.txt"))
                i += 1

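def merge_txt_files(temp_folder):
    """Merges all the .txt files from the given temp folder into a single file called "merged.txt" """
    # os.listdir() returns names in arbitrary order; sort by length, then name,
    # so that "2.txt" comes before "10.txt"
    filenames = sorted((name for name in os.listdir(temp_folder) if name.endswith(".txt")),
                       key=lambda name: (len(name), name))
    with open("merged.txt", "w") as outfile:
        for filename in filenames:
            with open(os.path.join(temp_folder, filename)) as infile:
                outfile.write(infile.read())

def delete_temp_folder(temp_folder):
    """Deletes the given temp folder and everything in it"""
    # os.rmdir() only removes empty directories, so use shutil.rmtree()
    shutil.rmtree(temp_folder)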

# paths to the zip files
zip1_path = "zip1.zip"
zip2_path = "zip2.zip"

# create a temporary folder to extract the .txt files
temp_folder = "temp"
os.makedirs(temp_folder, exist_ok=True)

# extract the .txt files from the zip files
extract_txt_files(zip1_path, temp_folder)
extract_txt_files(zip2_path, temp_folder)

# merge the .txt files
merge_txt_files(temp_folder)

# delete the temporary folder
shutil.rmtree(temp_folder)


print("The content is merged successfully!")
print("Do you want to view it? (y / n): ")

check = input()
if check == 'n':
    exit()
else:
    print()
    # merge_txt_files() always writes to "merged.txt"
    with open("merged.txt", "r") as c:
        print(c.read())

The zip paths in the script are relative, which means that "zip1.zip" and "zip2.zip" are expected to be in the current working directory (typically the directory the script is run from).

If the zip files contain multiple .txt files, the script will extract all of them to the temporary folder.

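The script renames the extracted .txt files with an incremental index and the .txt extension, so that all extracted files have unique names and none is overwritten. The numbering follows the order of the entries in the zip file; since os.listdir() returns names in arbitrary order, the merged output only follows that order if the names are sorted before merging.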
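
Because the extracted files are named 1.txt, 2.txt and so on, a plain string sort would place 10.txt before 2.txt. A small sketch of sorting by the numeric part instead; the helper name numeric_order is hypothetical, and it assumes every name has a purely numeric stem.

```python
from pathlib import Path

def numeric_order(names):
    """Sort names like '2.txt' and '10.txt' by their integer stem (hypothetical helper)."""
    return sorted(names, key=lambda n: int(Path(n).stem))

names = ["10.txt", "2.txt", "1.txt"]
print(sorted(names))         # plain string sort: ['1.txt', '10.txt', '2.txt']
print(numeric_order(names))  # numeric sort: ['1.txt', '2.txt', '10.txt']
```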