I wrote a Python script that reads a 1 GB JSON file and writes its contents out to multiple files. It ran terribly slowly on my work laptop, taking almost 30 minutes to complete. Colleagues suggested the antivirus might be scanning the new files the script creates, so I added my projects folder and the python.exe process to the Exclusions list in Windows Security (Virus & threat protection). That brought the execution time down to 5 minutes, which is still slow: the same script takes 16 seconds on my personal laptop, which has lower specs. I'm not using a virtual environment on either machine.

Are there other Python-related folders or processes that I need to add to the exclusion list? I'd like to solve this before I start working with much larger files.
I notice the two Windows processes below spiking whenever I run this script, or any other code that creates and writes to files.
Work laptop details:
- Python 3.9
- IDE: VS Code
- Processor: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz 2.11 GHz
- Installed RAM: 32.0 GB (31.8 GB usable)
- System type: 64-bit operating system, x64-based processor
Adding my Python code just in case:
import orjson as json
import time


def json_generator():
    # Each line of the input file is parsed as one JSON document.
    with open('json_file.json') as file:
        for line in file:
            yield json.loads(line)


def main():
    _start = time.perf_counter()
    payload = json_generator()
    file_num = 1
    for num, item in enumerate(payload, 1):
        # The current output file is reopened in append mode for every record.
        file_json = open(fr'./output/{file_num}.json', 'ba+')
        file_json.write(json.dumps(item))
        file_json.write(b'\n')
        # Move on to the next output file after every 10,000 records.
        if num % 10_000 == 0:
            file_json.close()
            file_num += 1
    print('time elapsed...', time.perf_counter()-_start)


if __name__ == "__main__":
    main()
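
For comparison, the loop above calls open() once per record, so a 10,000-record output file is opened 10,000 times. If the antivirus reacts to file opens and closes (an assumption, not something measured here), a variant that opens each output file only once might behave differently. The following is a minimal sketch of that idea, not a confirmed fix. The function name `split_json_lines` and its parameters are hypothetical, and it uses the standard-library `json` module instead of orjson so that it runs without extra installs.

```python
import json
from itertools import islice
from pathlib import Path


def split_json_lines(source, out_dir, batch_size=10_000):
    """Split a file with one JSON document per line into numbered files.

    Each output file receives up to batch_size records and is opened
    exactly once. Returns the number of output files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    file_num = 0
    with open(source, encoding="utf-8") as src:
        while True:
            # Pull the next batch_size lines from the input file.
            batch = list(islice(src, batch_size))
            if not batch:
                break
            file_num += 1
            # "w" overwrites an existing file, unlike the append mode above.
            with open(out_dir / f"{file_num}.json", "w", encoding="utf-8") as out:
                for line in batch:
                    out.write(json.dumps(json.loads(line)))
                    out.write("\n")
    return file_num
```

Called as `split_json_lines('json_file.json', './output')`, this would produce the same `1.json`, `2.json`, ... layout as the script above, with the difference that files left over from a previous run are overwritten rather than appended to.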
I've tried researching this, but most entries describe adding your projects folder to the Exclusions list, which I have already done.
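
To separate file-handling overhead from JSON parsing, a small timing probe like the one below could be run on both laptops, and on the work laptop both inside and outside the excluded folder. It is only a sketch: the function name `time_writes`, the file name `probe.txt`, and the default of 2,000 writes are hypothetical choices, and the temporary directory used in the example run is probably not covered by the exclusion, so the directory argument would need to point at the projects folder to test that case.

```python
import tempfile
import time
from pathlib import Path


def time_writes(directory, n_writes=2_000, reopen=True):
    """Append n_writes short lines to one file and return the elapsed seconds.

    reopen=True opens and closes the file for every line.
    reopen=False opens the file once and writes all lines through one handle.
    """
    path = Path(directory) / "probe.txt"
    start = time.perf_counter()
    if reopen:
        for _ in range(n_writes):
            with open(path, "ab") as fh:
                fh.write(b"x\n")
    else:
        with open(path, "ab") as fh:
            for _ in range(n_writes):
                fh.write(b"x\n")
    return time.perf_counter() - start


if __name__ == "__main__":
    # A fresh temporary directory per run keeps the two measurements independent.
    with tempfile.TemporaryDirectory() as tmp:
        print("reopen per write:", time_writes(tmp, reopen=True))
    with tempfile.TemporaryDirectory() as tmp:
        print("single open:     ", time_writes(tmp, reopen=False))
```

A large gap between the two numbers on the work laptop, but not on the personal one, would suggest that the cost sits in opening and closing files rather than in Python or orjson.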