I am experimenting with the QPlainTextEdit widget as a text editor. So far it works great: I can type large amounts of text and the UI doesn't freeze or stutter. I thought I would push the boundaries and see what happens.
The basic gist is that I can write pseudo code in my editor and then parse it for mistakes. If there are no mistakes, the parser emits XML based on the input text, so in the end I get an XML document describing the text. Essentially, I have managed to transform pseudo code into an XML file.
This works reasonably well, but the more text there is in the editor, the more memory it uses. I managed to paste about 750k lines of text into my editor. When it comes time to parse, I first read the text and then send the entire text to the parser. To that end I do:
editor_text = editor.toPlainText()  # editor is my QPlainTextEdit instance
This gives me all the text in the editor, which I can send to the parser and then convert to an XML file (if no mistakes are found).
With 750k lines of text in the editor, the toPlainText() method doesn't work so well. In fact, I just run out of memory.
My question is: how should I deal with really large amounts of text in order to parse it?
One thing I have thought about (but not tried) is reading the text block by block (or line by line), parsing each line and converting it to XML. But I'd still have to deal with the returned XML: keeping the resulting XML for each line/block in memory until the entire editor text has been parsed would probably still run out of memory.
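A minimal, untested-in-my-app sketch of that block-by-block idea, with each XML fragment written to a file as soon as it is produced so the full result never sits in memory. It assumes the parser can work on one line at a time, which may not hold for every grammar. parse_line is a hypothetical stand-in for my real parser, and the document/line element names are made up for illustration. The block iteration relies on QTextDocument.begin() and QTextBlock.isValid()/text()/next().

```python
import io
from xml.sax.saxutils import escape


def iter_block_texts(document):
    """Yield the text of each block (line) of a QTextDocument, one at a time."""
    block = document.begin()
    while block.isValid():
        # With PyQt4's default API on Python 2, text() returns a QString;
        # wrap it in unicode() there if the parser needs a Python string.
        yield block.text()
        block = block.next()


def parse_line(line_no, text):
    """Hypothetical stand-in for the real parser: one XML element per line."""
    return u'<line n="%d">%s</line>\n' % (line_no, escape(text))


def write_xml(lines, out):
    """Parse lines one by one and write each fragment to `out` immediately.

    Only the current line and its fragment are held in memory.
    Returns the number of lines processed.
    """
    out.write(u'<document>\n')
    count = 0
    for count, text in enumerate(lines, 1):
        out.write(parse_line(count, text))
    out.write(u'</document>\n')
    return count


# Possible usage inside the application (editor is the QPlainTextEdit):
#
#     with io.open('out.xml', 'w', encoding='utf-8') as out:
#         write_xml(iter_block_texts(editor.document()), out)
```

The editor itself still holds the text, so this would only avoid the second full copy made by toPlainText() and the in-memory XML. If a mistake is found halfway through, the partially written file would have to be discarded.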
I cannot imagine that this problem is specific to the QPlainTextEdit widget. In general, when there is a large amount of "code"/text in a single file, say 1M or even 10M lines, how would one go about reading and parsing all of it?
For my example I'm using Python 2.7 on Windows with PyQt 4.8.