I'm currently writing a Python script that processes very large (> 10 GB) files. Since loading the whole file into memory is not an option, I'm reading and processing it line by line:
for line in f:
    ...
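For completeness, the file is opened with a plain open() call and its default settings; the full loop looks roughly like this (the path and the process() helper are just placeholders, the actual per-line work doesn't matter here):

with open("/path/to/huge_file.txt", "r", encoding="utf-8") as f:
    for line in f:
        process(line)  # per-line work, irrelevant to the question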
Once the script is finished, it will run fairly often, so I'm starting to wonder what impact this kind of reading will have on the lifespan of my disks.
Will the script actually read line by line, or is there some kind of OS-level buffering happening? If not, should I implement an intermediary buffer myself? Is hitting the disk that often actually harmful? I remember reading that BitTorrent can wear out disks quickly precisely because of this kind of small, piecemeal reading/writing rather than operating on larger chunks of data.
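By "intermediary buffer" I mean something like the sketch below: read the file in large chunks myself and split them into lines. The chunked_lines() helper and the 16 MiB chunk size are just illustrations of the idea, not code I've benchmarked.

CHUNK_SIZE = 16 * 1024 * 1024  # 16 MiB per read() call; arbitrary guess

def chunked_lines(path, chunk_size=CHUNK_SIZE):
    """Yield lines, but pull data from the file in large chunks."""
    remainder = ""
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = remainder + chunk
            lines = buf.split("\n")
            remainder = lines.pop()  # last piece may be an incomplete line
            for line in lines:
                yield line
    if remainder:  # file did not end with a newline
        yield remainder

Or would simply passing a larger buffering argument to open() give me essentially the same effect without the extra code?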
I'm using both an HDD and an SSD in my test environment, so answers for both kinds of drive would be interesting.