I'm having trouble reading a large .txt file. The file contains a huge amount of data (88604154 lines, about 2696 MB), and I have to analyze the data and then plot a histogram of it.
The problem is that it takes ages for the computer to read that much data, so I thought I could read the data in parts and combine the parts afterwards. I did a little searching and came up with this code:
file_name = '/home/lam/Downloads/C3--Trace--00001.txt'

# Line numbers to keep; a range supports fast membership tests
lines_num = range(1, 50001)

with open(file_name, 'r') as fp:
    lines = []
    # Count lines from 1 so that line one is included
    for i, line in enumerate(fp, start=1):
        if i in lines_num:
            lines.append(line.strip())
        elif i > 50000:
            break
# No explicit close() is needed: the with block closes the file
With this I can read a fixed range of lines (for example, lines 1 to 50000), but I want to repeat the code about 1775 times in order to read all the data and then append everything to one list. How can I write a function for this?
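One possible shape for such a function, as a minimal sketch rather than a definitive solution: it uses `itertools.islice` to pull a fixed number of lines from the open file on each pass, so the file is scanned only once instead of being reopened for every range. The names `read_in_chunks`, `read_all_lines`, and `chunk_size` are hypothetical and chosen only for illustration.

```python
from itertools import islice


def read_in_chunks(path, chunk_size=50000):
    """Yield successive lists of up to chunk_size stripped lines from path."""
    with open(path, 'r') as fp:
        while True:
            # islice continues from wherever the previous chunk stopped
            chunk = [line.strip() for line in islice(fp, chunk_size)]
            if not chunk:
                break  # end of file reached
            yield chunk


def read_all_lines(path, chunk_size=50000):
    """Collect every chunk into one list (hypothetical helper)."""
    all_lines = []
    for chunk in read_in_chunks(path, chunk_size):
        all_lines.extend(chunk)
    return all_lines
```

Calling `read_all_lines(file_name)` would return all lines in one list. Since a single list of this size may not fit comfortably in memory, it may be preferable to loop over `read_in_chunks(file_name)` directly and process each chunk as it arrives.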