i have a large csv file to edit (tens of thousands lines) , the task is to edit it 'friendly' anf fast, so i would like to use PyQt QTableView to hold it, there are also some filter required, basically i need to do some "update price = 200 where name = 'Jack'" operation,
i've came up with few options but wonder if we could just combine the advantages
update: the code was on workstation and could not touch internet, i'll write psudo code instead, sorry
1, import into local sqlite
it's more than a minute to the disk and less than 10 seconds to memory, with a QSqlTableModel, it's relative fast, and adding a filter is easy, the output are not totally displayed until it's scrolled, which is good
# use pyqt and sqlite
fh = open("path/to/csv/file")
query = QSqlQuery()
query.prepare("""insert into table""")
for line in fh:
id, name, price = line.split(",")
query.bindValue()
query.exec()
self.view = QTableView()
self.model = QSqlTableModel()
self.model.setTable(table)
self.view.setModel(self.model)
# some line edit gather pattern input
self.model.setFilter("id = 1003")
2, just parse csv into model and display, i referenced pyqt - populating QTableWidget with csv data with the answer of user1006989
just put cell into QStandardItem, the Model/View works fine, but load large files is too slow ( about 20 seconds here ), and don't know how to implement a filter ( if we just skip when load into model, how can i write data back )
3, command line replace
I've implement both option 1 and 2, it's not very quick but might be acceptable, here I wonder if compose a Perl-like regex replacement could help, ( we need to see it's original value first )
proposed work flow is
regex search ==A> display / populate in table / model ==B> accept edit ==C> prepare a regex replace
gather pattern with pyqt line edit
grep $pattern large_csv > small_csv
cp small_csv small_csv.bk
populdate small_csv into model and display in table widget
commit change
diff small_csv small_csv.bk and prepare a
perl -ie 's/old_line/new_line' large_csv
which came up with a solution that i do not have to load the full content into database or populate into widget, this should be even faster, while still equiped with a filter, but i got some problem on ==C>
if the filter gots 4 record and i edit one of them, just backup the filter result and then diff, and then prepare a whole line prelacement ? ( each line is unique, there are some primary key things inside and won't change )
Hope some one could review my thoughts or give some advice,
Thanks a lot for your time :)