I asked a question here about how to read a very large file into Python, and I got a response based on zip_longest.
The problem is that this solution is extremely slow: Keras' model.predict took over 2 hours to process 200,000 lines of a file that normally takes under 3 minutes when the file is loaded directly into memory, and I want to be able to process files 5x this size.
I've since found the chunking functionality in pandas (the chunksize argument to read_csv), but I don't understand how to load a chunk of the file, reshape the data, and feed it to the model using these methods, and I also don't know whether this will be the fastest way to read and use the data in a very large file.
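To make the question concrete, here is a rough sketch of the kind of loop I'm imagining. The file name, the input shape, and the tiny stand-in model are all placeholders (my real model and data are different); the point is just the pattern of reading a chunk, reshaping it, and predicting on it:

```python
import numpy as np
import pandas as pd
from tensorflow import keras

TIMESTEPS, FEATURES = 10, 8   # placeholder input shape the model expects
CHUNK_ROWS = 10_000           # rows of the file to load at a time

# Stand-in for my real trained model, just so the snippet runs end to end
model = keras.Sequential([
    keras.layers.Input(shape=(TIMESTEPS, FEATURES)),
    keras.layers.Flatten(),
    keras.layers.Dense(1),
])

predictions = []
# read_csv with chunksize returns an iterator of DataFrames instead of
# loading the whole file into memory at once
for chunk in pd.read_csv("big_file.csv", header=None, chunksize=CHUNK_ROWS):
    x = chunk.to_numpy(dtype="float32")       # one chunk -> numpy array
    x = x.reshape((-1, TIMESTEPS, FEATURES))  # each row is one flattened sample
    predictions.append(model.predict(x))      # predict on this chunk only

all_predictions = np.concatenate(predictions)
```

This keeps only one chunk in memory at a time, but I have no idea whether it avoids the slowdown I saw with the zip_longest-based generator, or whether there is a faster way to stream the file into model.predict.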
Any fast solutions to this problem are welcome.