Assuming your large file is simple unstructured text (i.e. this is no good for structured formats like JSON), here's an alternative to reading every single line: read large binary bites of the input file until you reach your chunk size, then read a couple of lines so the chunk ends on a line boundary, close the current output file and move on to the next.
I compared this with the line-by-line approach, using @tdelaney's code adapted to the same chunksize as mine: that code took 250s to split a 12GiB input file into 6x2GiB chunks, whereas this took ~50s, so maybe five times faster. It looks I/O-bound on my SSD, running at >200MiB/s read and write, where the line-by-line version ran at 40-50MiB/s read and write.
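For reference, the line-by-line baseline looked roughly like this (a minimal sketch, not @tdelaney's exact code, assuming the same file names and chunk size): write whole lines to each output file until chunksize bytes have been written, then roll over to the next file.

    # Sketch of a line-by-line splitter for comparison (hypothetical, not
    # @tdelaney's exact code): write whole lines until chunksize is reached.
    outfile_template = "outfile-{}.txt"
    chunksize = 2_000_000_000

    count = 0
    byteswritten = chunksize  # forces a new output file on the first line
    outfile = None
    with open("large.text", "rb") as infile:
        for line in infile:
            if byteswritten >= chunksize:
                if outfile:
                    outfile.close()
                outfile = open(outfile_template.format(count), "wb")
                count += 1
                byteswritten = 0
            outfile.write(line)
            byteswritten += len(line)
    if outfile:
        outfile.close()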
I turned buffering off because there's not a lot of point when reading and writing in large blocks anyway. The bite size and the buffering setting may be tunable to improve performance; I haven't tried other settings because for me it seems to be I/O-bound regardless.
    import time

    outfile_template = "outfile-{}.txt"
    infile_name = "large.text"
    chunksize = 2_000_000_000
    MEB = 2**20   # mebibyte
    bitesize = 4_000_000  # the size of the reads (and writes) working up to chunksize

    count = 0
    starttime = time.perf_counter()

    infile = open(infile_name, "rb", buffering=0)
    outfile = open(outfile_template.format(count), "wb", buffering=0)
    while True:
        byteswritten = 0
        # copy whole bites until the chunk size is reached
        while byteswritten < chunksize:
            bite = infile.read(bitesize)
            # check for EOF
            if not bite:
                break
            outfile.write(bite)
            byteswritten += len(bite)
        # check for EOF
        if not bite:
            break
        # read a couple of lines so the chunk ends on a line boundary
        for i in range(2):
            l = infile.readline()
            # check for EOF
            if not l:
                break
            outfile.write(l)
        # check for EOF
        if not l:
            break
        outfile.close()
        count += 1
        print(count)
        outfile = open(outfile_template.format(count), "wb", buffering=0)

    outfile.close()
    infile.close()

    endtime = time.perf_counter()
    elapsed = endtime - starttime
    print(f"Elapsed= {elapsed}")
NOTE: I haven't exhaustively tested that this doesn't lose data. Although there's no evidence that it does, you should validate that yourself.
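One simple way to validate is to hash the original file and the concatenation of the chunks and check they match (a sketch, assuming the chunk names above; adjust the range to however many files the splitter printed):

    # Validation sketch: SHA-256 of the input vs. SHA-256 of the chunks
    # concatenated in order. Requires Python 3.8+ for the walrus operator.
    import hashlib

    def file_hash(names):
        h = hashlib.sha256()
        for name in names:
            with open(name, "rb") as f:
                while chunk := f.read(2**20):
                    h.update(chunk)
        return h.hexdigest()

    original = file_hash(["large.text"])
    chunks = file_hash([f"outfile-{i}.txt" for i in range(7)])  # adjust count
    print("OK" if original == chunks else "MISMATCH")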
It might be useful to add some robustness by checking, at the end of each chunk, how much data is left to read, so you don't end up with the last output file being 0-length (or shorter than bitesize).
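One way to do that check (a sketch with a hypothetical bytes_remaining helper): compare the input file's size against the current read position before opening the next output file, and stop if nothing is left.

    import os

    def bytes_remaining(f):
        # total input size minus the current read position
        return os.fstat(f.fileno()).st_size - f.tell()

    # ...at the end of the outer loop, instead of unconditionally opening
    # the next output file:
    #     if bytes_remaining(infile) == 0:
    #         break
    #     count += 1
    #     outfile = open(outfile_template.format(count), "wb", buffering=0)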
HTH
barny