I am having a problem with log search speed and disk size. The log is extremely big: it has about 220 million rows, takes up 25 gigabytes of disk space, and some selects take several minutes to complete.
How does it work? The log is saved in the database using SQL Anywhere, currently version 9, and will soon be migrated to 11 (we tried 12, but due to driver and other problems we went back to 11).
The log consists of two tables (names translated to English so people here can understand):
LogTable
Id, DateTime, User, Url, Action and TableName. Action is what the user did (insert/delete/update) and TableName is which table in the database was affected.
LogTableFields
Id, LogTable_Id, FieldName, NewValue, OldValue. LogTable_Id is a foreign key to LogTable. FieldName is the field of the affected table.
It is important to note that NewValue and OldValue are varchar, because they have to record every kind of field from the other tables (datetime, int, etc.).
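For clarity, here is a rough sketch of what the schema looks like (the exact column types and sizes in our database may differ; this is just to illustrate the structure):

```sql
-- Approximate structure of the two log tables (illustrative, not the real DDL)
CREATE TABLE LogTable (
    Id        INTEGER NOT NULL PRIMARY KEY,
    DateTime  TIMESTAMP NOT NULL,
    "User"    VARCHAR(50),
    Url       VARCHAR(255),
    Action    VARCHAR(10),    -- 'insert' / 'delete' / 'update'
    TableName VARCHAR(50)     -- which table in the database was affected
);

CREATE TABLE LogTableFields (
    Id          INTEGER NOT NULL PRIMARY KEY,
    LogTable_Id INTEGER NOT NULL REFERENCES LogTable (Id),
    FieldName   VARCHAR(50),
    NewValue    VARCHAR(255),  -- values of any original type stored as text
    OldValue    VARCHAR(255)
);
```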
Why was it made this way? Because we must record everything important. The system is made for a state Department of Traffic (I'm not sure that is the proper English name, but you get the idea), and sometimes they demand some kind of random report.
Until now, we have made these reports simply by running an SQL select. However, it takes several minutes to complete, even when filtered by datetime. That isn't a big issue when reports aren't requested often.
But they are demanding more and more reports, to the point that we need to build a feature in the software that produces a nice, polished report. Since we never know their needs in advance, we must go back to the log and unbury the data.
Some of the information requested exists only in the log (e.g. which user improperly gave someone access to a vehicle).
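A typical report query looks roughly like the sketch below (the filter values and the 'VehicleAccess' table name are made up for illustration); even with the DateTime filter, it has to touch a huge number of rows:

```sql
-- Illustrative example of the kind of ad hoc report query we run today;
-- the real queries vary, but they all join the two log tables like this
SELECT l.DateTime, l."User", l.Action, l.TableName,
       f.FieldName, f.OldValue, f.NewValue
FROM LogTable l
JOIN LogTableFields f ON f.LogTable_Id = l.Id
WHERE l.DateTime BETWEEN '2011-01-01' AND '2011-03-31'
  AND l.TableName = 'VehicleAccess'   -- hypothetical affected table
ORDER BY l.DateTime;
```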
Some ideas suggested so far:
Idea 1: I did some research and was told to try NoSQL, using CouchDB. But from the little I have read, I feel NoSQL isn't a solution for my problem. I can't argue why, since I have no experience with it.
Idea 2: Separate the log tables physically from the main database, or move them to another machine.
Idea 3: Create a mirror of every table with a version field to keep history (see the sketch below).
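For Idea 3, what I have in mind is something like the following (the Vehicle table and its columns are purely a hypothetical example):

```sql
-- Hypothetical example: a history/mirror table for a Vehicle table,
-- where every insert/update writes a new row with an incremented version
CREATE TABLE Vehicle_History (
    Id        INTEGER NOT NULL,   -- same Id as in the Vehicle table
    Version   INTEGER NOT NULL,   -- 1, 2, 3, ... per Id
    Plate     VARCHAR(10),
    OwnerId   INTEGER,
    ChangedAt TIMESTAMP NOT NULL,
    ChangedBy VARCHAR(50),
    PRIMARY KEY (Id, Version)
);
```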
I'd welcome a macro-level optimization, or an architecture change if that is what it takes.