0

Will using Monolog remove the potential bottleneck as compared to logging with file_put_contents with LOCK_EX flag?

I have the following public-facing PHP script, which gets executed about 10x/s on average, and 100x/s in peak times. This resulted in a bottleneck since scripts have to wait for each other due to LOCK_EX.

<?php
    // ... do some things
    // then log the result:
    file_put_contents("/home/logs/public.log", "\n" . date("d.m. G:i:s") . " (some log) $httpcode $retry_count", FILE_APPEND | LOCK_EX);
?>

I am considering rewriting it to use Monolog like this:

<?php
    require_once(DIR.'/vendor/autoload.php');
    use MonologLogger;
    use MonologHandlerStreamHandler;
    
    $logger = new Logger('public-script');
    $logger->pushHandler(new StreamHandler('/home/logs/public.log', Logger::INFO));

    // ... do some things
    // then log the result:
    $logger->info("(some log) $httpcode $retry_count");
?>

--

Based on Koala Yeung's answer, using Monolog will not help with the bottleneck. I need to find something else which is capable of writing to file without blocking the script's execution.

c0dehunter
  • 6,412
  • 16
  • 77
  • 139

1 Answers1

1

You may simply write a script to test yourself:

main.php:

<?php

use Monolog\Logger;
use Monolog\Handler\StreamHandler;

require __DIR__ . '/vendor/autoload.php';

$logger = new Logger('public-script');
$logger->pushHandler(new StreamHandler(__DIR__ . '/public.log', Logger::INFO));

$options = getopt("n:l:s:");

$name = $options['n'] ?? 'pid:' . getmypid();
$limit = (int) ($options['l'] ?? 1000);
$sleep = (int) ($options['s'] ?? 1000);

// ... do some things
// then log the result:
for ($i=1; $i<=$limit; $i++) {
        $logger->info("workering", [
                'worker' => $name,
                'count' => $i,
        ]);
        usleep($sleep);
}

run it:

php main.php -n 'worker 1' &; php main.php -n 'worker 2' &

Or run it with smaller interval (10 micro second per log):

php main.php -s 10 -n 'worker 1' &; php main.php -s 10 -n 'worker 2' &

The results:

[2021-02-09T19:37:51.649015+08:00] public-script.INFO: workering {"worker":"worker 1","count":1} []
[2021-02-09T19:37:51.650283+08:00] public-script.INFO: workering {"worker":"worker 1","count":2} []
[2021-02-09T19:37:51.650370+08:00] public-script.INFO: workering {"worker":"worker 1","count":3} []
[2021-02-09T19:37:51.650447+08:00] public-script.INFO: workering {"worker":"worker 1","count":4} []
[2021-02-09T19:37:51.649321+08:00] public-script.INFO: workering {"worker":"worker 2","count":1} []
[2021-02-09T19:37:51.650522+08:00] public-script.INFO: workering {"worker":"worker 1","count":5} []
[2021-02-09T19:37:51.650555+08:00] public-script.INFO: workering {"worker":"worker 2","count":2} []
[2021-02-09T19:37:51.650596+08:00] public-script.INFO: workering {"worker":"worker 1","count":6} []
[2021-02-09T19:37:51.650640+08:00] public-script.INFO: workering {"worker":"worker 2","count":3} []
[2021-02-09T19:37:51.650670+08:00] public-script.INFO: workering {"worker":"worker 1","count":7} []
[2021-02-09T19:37:51.650716+08:00] public-script.INFO: workering {"worker":"worker 2","count":4} []
[2021-02-09T19:37:51.650743+08:00] public-script.INFO: workering {"worker":"worker 1","count":8} []
[2021-02-09T19:37:51.650789+08:00] public-script.INFO: workering {"worker":"worker 2","count":5} []
[2021-02-09T19:37:51.650815+08:00] public-script.INFO: workering {"worker":"worker 1","count":9} []
[2021-02-09T19:37:51.650861+08:00] public-script.INFO: workering {"worker":"worker 2","count":6} []
[2021-02-09T19:37:51.650886+08:00] public-script.INFO: workering {"worker":"worker 1","count":10} []
[2021-02-09T19:37:51.650911+08:00] public-script.INFO: workering {"worker":"worker 2","count":7} []
[2021-02-09T19:37:51.650958+08:00] public-script.INFO: workering {"worker":"worker 1","count":11} []
[2021-02-09T19:37:51.650985+08:00] public-script.INFO: workering {"worker":"worker 2","count":8} []
[2021-02-09T19:37:51.651031+08:00] public-script.INFO: workering {"worker":"worker 1","count":12} []
[2021-02-09T19:37:51.651057+08:00] public-script.INFO: workering {"worker":"worker 2","count":9} []
[2021-02-09T19:37:51.651103+08:00] public-script.INFO: workering {"worker":"worker 1","count":13} []
...

The logs interweave quite well.

If you read the source code of StreamHandler::write(), you can see that it is using the same exclusive locking (LOCK_EX) mechanism as file_put_contents.

Koala Yeung
  • 7,475
  • 3
  • 30
  • 50
  • So basically, there will be no performance gain if I started using Monolog instead of simply `file_put_contents` with `LOCK_EX` flag; is that correct? – c0dehunter Feb 09 '21 at 12:13
  • By having all the abstraction and feature provided by Monolog, I believe there is only performance lost. But having a better abstraction for logging give you flexibility and functionality in the future (e.g. switch from file logging to database logging, easily switching log level). If the performance lost is not significant to your use, it's preferable to use higher level library than straight `file_put_contents()`. – Koala Yeung Feb 09 '21 at 12:20
  • Thank you for your input. It seems I need to continue my quest to find something that is able to write asynchronously to the same file without blocking the script's execution. – c0dehunter Feb 09 '21 at 12:21
  • 1
    @PrimožKralj: Why not logging to some other non-blocking data storages? If you choose to log to files, there will always be some locking mechanism to make sure things got through without overwriting each other. With Monolog, you can implement your own handler for your source. – Koala Yeung Feb 09 '21 at 12:27
  • Could you recommend me some non-blocking data storage that is trivial to use? – c0dehunter Feb 09 '21 at 12:32
  • 1
    You can either send it to RabbitMQ and have workers to write the log. The queue service is non-blocking. Another option I can think of would be non-blocking database like [Apache Cassandra](https://cassandra.apache.org/). – Koala Yeung Feb 09 '21 at 12:48
  • I know I'm late to the party, but I'd just like to clarify something in case others follow: Monologs StreamHandler does NOT use locking by default. – DennisK Dec 01 '22 at 13:55