-1

I am parsing my database logs using fluentd where I am parsing the logs using regular expression and indexing them to solr, I have a problem where few logs are in the below format

2023-08-24 08:28:20.278 UTC [26154] abcd@rst STATEMENT:SELECT 
           purpose,
            COUNT(*) AS std_count
        FROM
            emp.database
            LEFT JOIN LATERAL (
                SELECT
                    unnest(string_to_array(db_platform, ',')) AS platform
                UNION ALL
                SELECT
                    'Not Specified'
                WHERE
                    db_platform IS NULL
                    OR db_platform = ''
            ) AS platform_data ON true
            LEFT OUTER JOIN std.server ON std.database.std_id = emp.server.emp_id
            LEFT OUTER JOIN emp.database ON std.server std_id = emp_id
        WHERE
            1 = 1
            AND std.server_customer IN ('EMPX')
            AND std.server.business IN ('IT Services')
            AND std.server.department IN ('Product Development')
        GROUP BY
            1
        ORDER BY
            1
        unnest(string_to_array(s.purpose, ',')) AS platform
                UNION ALL
                SELECT
                    'Not Specified'
                WHERE
                    s.purpose IS NULL
                    OR s.purpose = ''
            ) AS platform_data ON true
 

and few logs are in below format

2023-08-24 08:32:36.963 UTC [14006] abc@vx LOG:  could not receive data from client: Connection reset by peer
2023-08-24 08:32:36.963 UTC [3403] xyz@mdp LOG:  ould not receive data from client: Connection reset by peer

while using the below regular expression I am able to parse the logs like below,

**expression**:

    (?<time_stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{3})[\s]
     UTC[\s][\[](?<qid>[^ ]*)[\]][\s]
    (?<user>[^ ]*)[\@](?<schema>[^ ]*)[\s](?<level>[^:]*):(?<message>.*)

**Match 1**
time_stamp :2023-08-24 08:28:20.278
qid :26154
user:   writeuser
schema: abc
level:  STATEMENT
message:    SELECT

**Match 2**
time_stamp: 2023-08-24 08:32:20.586
qid :13788
user:   abc
schema: vx
level:  LOG
message:    could not receive data from client: Connection reset by peer

**Match 3**
time_stamp: 2023-08-24 08:32:20.586
qid:    13779
user:   abc
schema: vx
level:  LOG
message:    could not receive data from client: Connection reset by peer

As we can see that the 1st log is only showing message as SELECT , any chance that I can capture the complete message.

Ram
  • 21
  • 5

0 Answers0