47

I have a few scripts run by cron quite often. Right now I don't store any logs, so if a script fails, I won't know it until I see the results - and even when I notice that the results are not correct, I can't do anything, since I don't know which script failed.

I've decided to start storing logs, but I am still not sure how to do it. So, my question is - what's more efficient: storing logs in an SQL database or in files?

I can create a 'logs' table in my MySQL database and store each log entry in a separate row, or I can just use PHP's file_put_contents or fopen/fwrite to store logs in separate files.

My scripts would add approximately 5 log entries (in total) per minute while running. I've done a few tests to determine what's faster - fopen/fwrite or MySQL's INSERT. I looped an INSERT statement 3000 times to create 3000 rows, and looped fopen/fwrite 3000 times to create 3000 files with sample text; fwrite executed 4-5 times faster than the SQL INSERT. Then I ran a second test - I looped a SELECT statement and assigned the result to a string 3000 times, and I also opened 3000 files using fopen and assigned their contents to a string. The result was the same - fopen/fwrite finished the task 4-5 times faster.
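For reference, here is a minimal sketch of the kind of test described above (not the original test script; the connection details, database name, `logs` table with a `message` column, and the /tmp/logtest path are made up for illustration):

<?php
// Rough benchmark sketch: 3000 INSERTs vs 3000 small file writes.
$mysqli = new mysqli('localhost', 'user', 'password', 'test');

$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    $mysqli->query("INSERT INTO logs (message) VALUES ('sample log entry $i')");
}
printf("3000 INSERTs: %.3f s\n", microtime(true) - $start);

@mkdir('/tmp/logtest', 0777, true);
$start = microtime(true);
for ($i = 0; $i < 3000; $i++) {
    file_put_contents("/tmp/logtest/$i.log", "sample log entry $i");
}
printf("3000 file writes: %.3f s\n", microtime(true) - $start);
?>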

So, to all experienced programmers - what's your experience with storing logs? Any advice?

// 04.09.2011 EDIT - Thank you all for your answers, they helped me a lot. Each post was valuable, so it was quite hard to accept only one answer ;-)

biphobe
  • The slowness is in the overhead of the `insert` statement. If you were to add the data to a CSV file and read it in using `load data infile`, the 4-5 times would quickly melt to 2 times: 1x for writing the CSV file, 1x for load data infile. – Johan Aug 31 '11 at 13:13
  • 1
    @firian - You just need to trigger the script to send an email (containing the details) to you when there is a problem – ajreal Aug 31 '11 at 13:30
  • You can use a cache database like Redis or Memcached, plus a process that moves it all into MySQL. You can also use MongoDB directly, or via a Redis bridge. Redis is really fast, Mongo is slower, and MySQL is really, really slow xD. You could also use an external log service like loggly.com – user1710825 Jan 04 '15 at 05:18

9 Answers

20

Logging to files is more efficient; however, logs stored in the database are easier to read, even remotely (you can write a web frontend if required, for example).

Note, however, that connecting to and inserting rows into the database is error-prone (database server down, wrong password, out of resources), so where would you log those errors if you decided to use the database?
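One way to handle that case (a sketch only, not part of this answer; the PDO handle, the table layout and the fallback path are assumptions) is to attempt the database write and fall back to a plain file if it fails, as also suggested in the comments below:

<?php
// Sketch: log to the database, fall back to a file if that fails.
// Assumes $pdo was created with PDO::ERRMODE_EXCEPTION enabled.
function writeLog(PDO $pdo, $message)
{
    try {
        $stmt = $pdo->prepare('INSERT INTO logs (logged_at, message) VALUES (NOW(), ?)');
        $stmt->execute(array($message));
    } catch (PDOException $e) {
        // Keep both the original message and the reason the database write failed.
        $line = date('Y-m-d H:i:s') . " $message (db logging failed: {$e->getMessage()})\n";
        file_put_contents('/var/log/myapp/fallback.log', $line, FILE_APPEND | LOCK_EX);
    }
}
?>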

trojanfoe
  • 6
    Boss would send a CRITICAL log message via email. – Nick Aug 29 '16 at 19:21
  • 1
    log to a file as a backup, maybe in JSON so it's easy to parse when the server is back up, and send an email for the critical errors – over_optimistic Aug 25 '18 at 02:34
  • 2
    You could log to the db, but if there is an error logging to the db, you log to a file instead, recording both the original error that should have gone to the db and the error experienced when trying to write to the log db – Daniel Valland May 16 '20 at 14:28
18

You can use a component such as Zend_Log, which natively supports the concept of multiple writers attached to the same log instance. That way you can log the same message to one or more different places with no need to change your logging code. And you can always change your code to replace the log system, or add a new one, in a simple way.
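As a rough illustration of that idea (Zend Framework 1 style; the file path, connection details, table and column names below are placeholders, not taken from the question):

<?php
require_once 'Zend/Log.php';
require_once 'Zend/Log/Writer/Stream.php';
require_once 'Zend/Log/Writer/Db.php';
require_once 'Zend/Db.php';

$logger = new Zend_Log();

// Writer 1: append to a plain log file.
$logger->addWriter(new Zend_Log_Writer_Stream('/var/log/myapp/cron.log'));

// Writer 2: store the same events in a database table as well.
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => 'localhost',
    'username' => 'user',
    'password' => 'secret',
    'dbname'   => 'mydb',
));
$logger->addWriter(new Zend_Log_Writer_Db($db, 'logs', array(
    'lvl' => 'priority',   // column 'lvl' receives the event priority
    'msg' => 'message',    // column 'msg' receives the message text
)));

// One call, both destinations.
$logger->info('cron script finished');
?>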

For your question, I think that logging to files is simpler and more appropriate if you (the developer) are the only one who needs to read the log messages.

Log to the db instead if other people need to read the logs in a web interface, or if you need the ability to search through them. As someone else has pointed out, concurrency also matters: if you have a lot of users, logging to the db could scale better.

Finally, a log frequency of 5 messages per minute requires almost no CPU from your application, so you don't need to worry about performance. In your case I'd start with log files and then change (or add more writers) if your requirements change.

Fabio
  • 3
    The more complex the tool you're using, the less stable it is. Keep it simple and use straight log files. – Your Common Sense Sep 04 '11 at 12:04
  • @Col.Shrapnel The more complex the tool you're using, the more flexible it is. Moreover, Zend_Framework is more than three years old and it's well tested; it should be pretty solid, don't you think? – Fabio Sep 04 '11 at 21:13
  • 1
    Flexibility is not among the most important features of logging, but fault tolerance is. If you want your own logger with blackjack and hookers - please, do *post*-processing: use whatever log analyzer you like, which will be happy to put your logs into a database, a lunar base, or whatever. – Your Common Sense Sep 05 '11 at 04:48
  • Zend_Log link is not accessible anymore. – jawo Mar 16 '16 at 13:10
7

Commenting on your findings.

Regarding writing to a file, you are probably right.
Regarding reading, you are dead wrong.

Writing to a database:

  1. MyISAM locks the whole table on inserts, causing lock contention. Use InnoDB, which has row-level locking.
  2. Contrary to 1, if you want to do fulltext searches on the log, use MyISAM; it supports fulltext indexes.
  3. If you want to be really fast you can use the MEMORY engine, which keeps the table in RAM; transfer the data to a disk-based table when CPU load is low.

Reading from the database

This is where the database truly shines.
You can combine all sorts of information from different entries, much, much faster and more easily than you ever could from a flat file.

SELECT logdate, username, action FROM log WHERE user_id = '1' /*root*/ AND error = 10;

If you have indexes on the fields used in the WHERE clause, the result will return almost instantly; try doing that on a flat file.

SELECT username, count(*) as error_count 
FROM log 
WHERE error <> 0 
GROUP BY user_id WITH ROLLUP

Never mind the fact that the table is not normalized; this will be much, much slower and harder to do with a flat file.
It's a no-brainer, really.

Johan
5

Speed isn't everything. Yes, it's faster to write to files, but it's far faster for you to find what you need in the logs if they are in a database. Several years ago I converted our CMS from a file-based log to a MySQL table. The table is better.

Charlie
3

It depends on the size of the logs and on the concurrency level. Because of the latter, your test is completely invalid: if there are 100 users on the site and, let's say, 10 threads writing to the same file, fwrite won't be that much faster. One of the things an RDBMS provides is concurrency control.
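For what it's worth, a minimal sketch of guarding concurrent appends to a single log file in PHP (not part of this answer; the path is a placeholder):

<?php
// LOCK_EX makes file_put_contents take an exclusive lock for the duration of the write,
// so concurrent writers don't interleave partial lines.
$line = date('Y-m-d H:i:s') . " cron-job-x: something happened\n";
file_put_contents('/var/log/myapp/cron.log', $line, FILE_APPEND | LOCK_EX);
?>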

It also depends on the requirements and on what kind of analysis you want to perform. Just reading records is easy, but what about aggregating some data over a defined period?

Large-scale web sites use systems like Scribe for writing their logs.

If you are talking about 5 records per minute, however, this is a really low load, so the main question is how you are going to read them. If a file is suitable for your needs, go with the file. Generally, append-only writes (usual for logs) are really fast.

Maxim Krizhanovsky
2

I think storing logs in the database is not a good idea. The pro of storing logs in a database rather than in files is that you can analyse your logs much more easily with the power of SQL; the con, however, is that you have to spend much more time on database maintenance. You'd better set up a separate database server to store your logs, or you might get so many log INSERTs that they degrade the performance of your production database; also, it's not easy to migrate or archive logs stored in a database, compared with files (logrotate, etc.).

Nowadays you should use a dedicated, feature-rich logging system to handle your logs; for example, logstash (http://logstash.net/) has a log collector and filters, and it can store logs in external systems such as Elasticsearch, combined with a beautiful frontend for visualizing and analyzing your logs.


Xiao Hanyu
  • Oh, there is an error in the application; oh, all errors are getting logged into the application's database; oh, now there are even more errors and then more data in the database; oops, there is recursion when the error comes; wow, stack traces can become really big while logging the errors; oh, the application database does not have much space any longer; oh, there are even more errors logged because database space is low; oh, the space of the database is gone; oh, the application is gone - let's check the logs to see what's going on, we have them in the database for easy access .... – hakre Jul 13 '21 at 16:45
1

Writing to the filesystem should always be faster.

That, however, shouldn't be your concern. Both doing a simple insert and writing to a filesystem are quick operations. What you need to worry about is what happens when your database goes down. I personally like to write to both, so there is always a log if anything goes wrong, but you also have the ease of searching in a database.

Tom Squires
  • Citation needed. Methinks writing to a file causes a full file lock on every write; if your database (engine) supports row locking, the DB can be much faster. – Johan Aug 31 '11 at 13:12
  • 1
    @Johan there are no rows in the files, and the database keeps its data nowhere but in files – Your Common Sense Aug 31 '11 at 13:13
  • @Col, a database can use a few tricks to make things faster: you can use a memory engine, or a partitioned table spread across different disks, to make inserts faster than you can in a flat file. My point is that the filesystem is **not always** faster. – Johan Aug 31 '11 at 13:30
  • @Johan I would expect that in most cases transferring the data over a network would take an order of magnitude longer than sending it to a file. I must admit I haven't benchmarked it, though. – Tom Squires Aug 31 '11 at 13:43
-1

Error logging is best limited to files in my opinion, because if there is a problem with the database, you can still log that. Obviously that's not an option if your error logging requires a connection to the database!

What I will also say, though, is that general logging is something I leave in the database - but this only applies if you are doing lots of logging for audit trails etc.

gamesmad
-2

Personally, I prefer log files so I've created two functions:

<?php
// Append a timestamped message to the given log file (does nothing if no filename is given).
function logMessage($message=null, $filename=null)
{
    if (!is_null($filename))
    {
        $logMsg=date('Y/m/d H:i:s').": $message\n";
        error_log($logMsg, 3, $filename); // message type 3 = append to the file named in $filename
    }
}

// Same as logMessage(), but marks the entry as an error.
function logError($message=null, $filename=null)
{
    if (!is_null($message))
    {
        logMessage("***ERROR*** {$message}", $filename);
    }
}
?>

I define a constant or two (I use ACTIVITY_LOG and ERROR_LOG, both set to the same file, so you don't need to refer to two files side by side to get an overall view of the run) and call the functions as appropriate. I've also created a dedicated folder (/var/log/phplogs), and each application that I write has its own log file. Finally, I rotate the logs so that I have some history to refer back to for customers.
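For instance (illustrative only - the constant values below are made up, not the actual configuration described above):

<?php
define('ACTIVITY_LOG', '/var/log/phplogs/myapp.log');
define('ERROR_LOG', ACTIVITY_LOG);   // same file, as described above

logMessage('fetching remote feed', ACTIVITY_LOG);
logError('remote service did not respond', ERROR_LOG);
?>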

Liberal use of the above functions means that I can trace the execution of apps fairly easily.

DaveyBoy
  • Because I run multiple applications simultaneously and I don't want to trawl through a single (potentially **large**) log file looking for errors for a particular application. And before anyone says "A lot of errors? Sloppy coding!", I access a lot of external services which have the capability to fail so I need to log that too – DaveyBoy Aug 31 '11 at 14:09
  • So, you're gathering PHP's native errors into a single (potentially large) log file, but the manual ones go into smaller logs. Strange setup, if you ask me – Your Common Sense Aug 31 '11 at 14:11
  • I'm more concerned about errors calling other services than errors caused by PHP itself. I run approximately 50 apps per hour, and finding errors relating to a single one of these within a single, large file would take too long - separate files make this easier, as I can copy/paste errors into an email and contact the relevant service support for their assistance. They can then check their systems, trace through the calls and find out where the error occurred on their side. If it's my fault, I admit it, but most of the errors I get are from calling remote services or retrieving files – DaveyBoy Aug 31 '11 at 14:17
  • I am just wondering why you don't set up your destination file once, without bothering with these custom functions or with always setting a filename manually. I am also wondering why the double timestamp doesn't bother you... – Your Common Sense Aug 31 '11 at 14:35
  • There is no double timestamp - logError simply prepends the passed message with "ERROR" and passes it to the logMessage function. The main thing is that it works for me, and considering I'm the sole sysadmin, developer, dba and support for the platform, that's more important than having a single file logging everything, which I would have to search through to get details about a single error. The original poster was asking a question and I responded with my thoughts and supplied a couple of functions. Whether he takes heed of them is entirely up to them. Personal choice is what it's all about – DaveyBoy Sep 01 '11 at 11:48
  • Nobody told you to have a single file. I am just wondering why you don't set up a certain log file per application, using ini_set. And my error_log() puts in a timestamp automatically. Strange – Your Common Sense Sep 01 '11 at 11:51
  • The code base I inherited is huge. Going through every app to add in your suggestion would take ages and I don't have the time or the patience. As I said, it works for me. If you do something different, good for you. – DaveyBoy Sep 01 '11 at 11:58
  • But you were already going through every app to add $filename manually? – Your Common Sense Sep 01 '11 at 12:28
  • Nope. I never said that at all. I suggest you re-read what I've written. Also, I do this thing called "testing" to find all the PHP native errors before releasing to live. However, "testing" cannot take into account run-time errors caused by outside influences. I want to log these errors so that I can refer to them when contacting remote parties. And that, ladies and gentlemen, is my last comment on the subject. To the original poster, I'm sorry that this has hijacked your question. The general answer is "Do what suits you and don't worry about doing it wrong - change it later on if needed" – DaveyBoy Sep 01 '11 at 12:48
  • These comments do not affect the opening question in any way :) Think of your setup - it can be simplified dramatically – Your Common Sense Sep 01 '11 at 13:30