I am using Symfony 3.1 and I am trying to configure Monolog so that requests from the GoogleBot are not logged...
The quickest way to prevent robots from visiting your site is to put these two lines into the /robots.txt file on your server. Create a robots.txt file in the 'web' directory and paste the following content:
User-agent: *
Disallow: /
https://support.google.com/webmasters/answer/6062608?hl=en&visit_id=1-636097099675465769-3677253464&rd=1
This is the recommended option when you need to block access completely, meaning your site will no longer be indexed by search engines and other bots. You don't need to configure or implement anything in your application to achieve it.
Now, suppose you need the bot to visit the site but don't want its requests to end up in the logs. Instead of writing log files somewhere, some handlers are used to filter or modify log entries before sending them to other handlers. One powerful, built-in handler called fingers_crossed is used in the prod environment by default. It stores all log messages during a request but only passes them to a second handler if one of the messages reaches an action_level:
# app/config/config.yml
monolog:
    handlers:
        filter_for_errors:
            type: fingers_crossed
            # if *one* log is error or higher, pass *all* to file_log
            action_level: error
            handler: file_log

        # now passed *all* logs, but only if one log is error or higher
        file_log:
            type: stream
            path: "%kernel.logs_dir%/%kernel.environment%.log"
Thus, your prod.log file will only record the messages of requests that produced some error, so bots have no effect at this level.
More details about this: http://symfony.com/doc/current/logging.html
What you are trying to do is not advisable, because the handler will depend on the HTTP request instead of the log records, which is out of context for a handler. However, you can easily register your own handler in Symfony:
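To make the buffering behaviour concrete, here is a minimal, self-contained sketch in plain PHP. It is not Monolog's implementation: the class name FingersCrossedSketch is hypothetical, the public $written array merely stands in for the file_log handler, and the numeric levels are assumed to mirror Monolog's (100 = DEBUG, 200 = INFO, 400 = ERROR).

```php
<?php

// Simplified model of the fingers_crossed idea, NOT Monolog's real code.
// FingersCrossedSketch is a hypothetical name used only for this example.
class FingersCrossedSketch
{
    private $actionLevel;
    private $buffer = array();
    private $activated = false;
    public $written = array(); // stands in for the wrapped file_log handler

    public function __construct($actionLevel)
    {
        $this->actionLevel = $actionLevel;
    }

    public function handle(array $record)
    {
        if ($this->activated) {
            // Already triggered during this request: pass records straight through.
            $this->written[] = $record;
            return;
        }

        $this->buffer[] = $record;

        if ($record['level'] >= $this->actionLevel) {
            // One record reached the action level: flush everything buffered so far.
            $this->activated = true;
            $this->written = array_merge($this->written, $this->buffer);
            $this->buffer = array();
        }
    }
}

// A bot request that triggers no error: nothing reaches the log file.
$botRequest = new FingersCrossedSketch(400);
$botRequest->handle(array('level' => 200, 'message' => 'Matched route "homepage"'));
$botRequest->handle(array('level' => 100, 'message' => 'Notified event "kernel.request"'));

// A request that fails: the earlier records are written together with the error.
$failingRequest = new FingersCrossedSketch(400);
$failingRequest->handle(array('level' => 200, 'message' => 'Matched route "checkout"'));
$failingRequest->handle(array('level' => 400, 'message' => 'Uncaught exception'));

echo count($botRequest->written), "\n";
echo count($failingRequest->written), "\n";
```

In this model the first request writes nothing and the second writes both of its records, which is why ordinary crawler traffic that causes no errors should leave no trace in prod.log.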
Let's create the custom handler class:
namespace AppBundle\Monolog\Handler;

use Monolog\Handler\AbstractHandler;

class StopBotLogHandler extends AbstractHandler
{
    public function isBotRequestDetected()
    {
        // Put your own code to detect bot requests here; it must return true or false.
        // For example, a simple User-Agent check (adjust the pattern to your needs):
        return isset($_SERVER['HTTP_USER_AGENT'])
            && preg_match('/bot|crawl|slurp|spider/i', $_SERVER['HTTP_USER_AGENT']);
    }

    /**
     * Checks whether the given record will be handled by this handler.
     *
     * This is mostly done for performance reasons, to avoid calling processors for nothing.
     *
     * Handlers should still check the record levels within handle(), returning false in isHandling()
     * is no guarantee that handle() will not be called, and isHandling() might not be called
     * for a given record.
     *
     * @param array $record Partial log record containing only a level key (e.g: array('level' => 100) for DEBUG level)
     *
     * @return bool
     */
    public function isHandling(array $record)
    {
        return $this->isBotRequestDetected();
    }

    /**
     * Handles a record.
     *
     * All records may be passed to this method, and the handler should discard
     * those that it does not want to handle.
     *
     * The return value of this function controls the bubbling process of the handler stack.
     * Unless the bubbling is interrupted (by returning true), the Logger class will keep on
     * calling further handlers in the stack with a given log record.
     *
     * @param array $record The record to handle
     *
     * @return bool true means that this handler handled the record, and that bubbling is not permitted.
     *              false means the record was either not processed or that this handler allows bubbling.
     */
    public function handle(array $record)
    {
        // Does nothing with the record. Returns true when the request is detected as a "bot",
        // which breaks the handlers loop; otherwise returns false so other handlers handle the record.
        return $this->isBotRequestDetected();
    }
}
Whenever you add a record to the logger, it traverses the handler stack. Each handler decides whether it fully handled the record, and if so, the propagation of the record ends there.
Important: Read the phpdoc of the isHandling() and handle() methods for more details.
Next, let's register the class as a service "without tags":
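The detection itself can be kept in a small function that is easy to test outside the handler. The following is only a minimal sketch: the function name isBotUserAgent is hypothetical, and the pattern is the same heuristic shown in the handler above, so it matches any User-Agent containing "bot", "crawl", "slurp" or "spider" and will miss bots that do not identify themselves.

```php
<?php

// Hypothetical helper: returns true when the User-Agent looks like a crawler.
// This is a heuristic, not a reliable bot check; adjust the pattern as needed.
function isBotUserAgent($userAgent)
{
    if (!is_string($userAgent) || $userAgent === '') {
        return false;
    }

    return preg_match('/bot|crawl|slurp|spider/i', $userAgent) === 1;
}

$googlebot = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
$firefox = 'Mozilla/5.0 (X11; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0';

var_dump(isBotUserAgent($googlebot)); // bool(true)
var_dump(isBotUserAgent($firefox));   // bool(false)
var_dump(isBotUserAgent(null));       // bool(false)
```

Inside the handler, isBotRequestDetected() could call such a function with $_SERVER['HTTP_USER_AGENT'] when that key is set.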
# app/config/services.yml
services:
    monolog.handler.stop_bot_log:
        class: AppBundle\Monolog\Handler\StopBotLogHandler
        public: false
Then, add the handler to the handlers list:
# app/config/config_prod.yml
monolog:
    handlers:
        # ...
        stopbotlog:
            type: service
            id: monolog.handler.stop_bot_log
            priority: 1
Note that the type property must be equal to service, id must be the name of the service defined above, and priority must be greater than 0 to ensure that this handler is executed before any other handler.
When the GoogleBot performs a request to the application, the stopbotlog handler stops all the handlers after it, so no log message is recorded.
Remember, this is not the recommended way to do it! Depending on your needs, implementing option 1 or 2 should be enough.
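The effect on the handler stack can be illustrated with a simplified model in plain PHP. It does not use Monolog: dispatchRecord and the two closures are hypothetical stand-ins for the logger loop and the handlers, and the loop only mimics the rule described above, namely that a handler returning true stops propagation.

```php
<?php

// Simplified model of a handler stack, NOT Monolog's real code.
// Each handler is a callable that returns true to stop propagation.
function dispatchRecord(array $handlers, array $record)
{
    foreach ($handlers as $handler) {
        if (true === $handler($record)) {
            break;
        }
    }
}

$isBot = true; // pretend the current request comes from a bot
$logged = array();

$handlers = array(
    // First in the stack: swallows the record for bot requests.
    function (array $record) use (&$isBot) {
        return $isBot;
    },
    // Second in the stack: stands in for the real file handler.
    function (array $record) use (&$logged) {
        $logged[] = $record['message'];
        return false;
    },
);

dispatchRecord($handlers, array('message' => 'bot hit'));
$isBot = false;
dispatchRecord($handlers, array('message' => 'user hit'));

print_r($logged);
```

Only the record from the non-bot request reaches the second handler, which is the behaviour the priority setting is meant to guarantee.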
If you want to ignore bot requests for a group of handlers, you can override the monolog.handler.group.class container parameter and change the group handler behavior:
namespace AppBundle\Handler;

use Monolog\Handler\GroupHandler;

class NoBotGroupHandler extends GroupHandler
{
    public function isBotRequestDetected()
    {
        // here your code to detect Bot requests, return true or false
    }

    public function handle(array $record)
    {
        if ($this->isBotRequestDetected()) {
            // ignore bot request for handlers list
            return false === $this->bubble;
        }

        return parent::handle($record);
    }
}
In your config_prod.yml or services.yml:
parameters:
    monolog.handler.group.class: AppBundle\Handler\NoBotGroupHandler
That's it! Now you can stop bot logs for a custom list of handlers:
# config_prod.yml
monolog:
    handlers:
        grouped:
            type: group
            members: [main, console, chromephp]
Finally, if you have difficulty analyzing your log files, I recommend this amazing tool: https://github.com/EasyCorp/easy-log-handler