
I am working with the Elastic Stack and MySQL. Everything works fine: Logstash reads data from the MySQL database and sends it to Elasticsearch. To update Elasticsearch automatically when new rows are added to MySQL, I am using the schedule parameter. However, with this setup Logstash keeps running in its terminal and continuously checks for new data, and that is my main concern.

input {

  jdbc { 
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    # The user we wish to execute our statement as
    jdbc_user => "root"
    jdbc_password => ""
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/home/Downloads/mysql-connector-java-5.1.38.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Run the query every 15 minutes (cron syntax)
    schedule => "*/15 * * * *"
    use_column_value => true
    tracking_column => 'EVENT_TIME_OCCURRENCE_FIELD'
    # our query
    statement => "SELECT * FROM brainplay WHERE EVENT_TIME_OCCURRENCE_FIELD > :sql_last_value"
    }

  }
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts         => "localhost:9200"
    index         => "test-migrate"
    document_type => "data"
    document_id   => "%{personid}"
  }
}

But if the data set is large, Logstash will check the entire data set for new entries on every run, with no stopping point. This will reduce scalability and use more processing power.

Is there another method, such as a webhook, so that MySQL notifies Logstash only about new data when it is inserted, or so that Logstash checks only for new entries? Please help.

ankitkhandelwal185

1 Answer


You can use the sql_last_start parameter in your query with any timestamp field (assuming the table has a timestamp field such as last_updated).

For example, your query could include:

WHERE last_updated >= :sql_last_start

From this answer,

For example, the first time you run this sql_last_start will be 1970-01-01 00:00:00 and you'll get all rows. The second run sql_last_start will be (for example) 2015-12-03 10:55:00 and the query will return all rows with a timestamp newer than that.

Alternatively, you can read this answer on using :sql_last_value:

WHERE last_updated > :sql_last_value
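
To show how that WHERE clause could fit into the jdbc input from the question, here is a minimal sketch rather than a tested configuration. It assumes the brainplay table has a timestamp column named last_updated (the question uses EVENT_TIME_OCCURRENCE_FIELD instead), and the last_run_metadata_path value is a hypothetical location chosen only for illustration. Connection settings are copied from the question.

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    jdbc_user => "root"
    jdbc_password => ""
    jdbc_driver_library => "/home/Downloads/mysql-connector-java-5.1.38.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Run the query every 15 minutes (cron syntax)
    schedule => "*/15 * * * *"
    # Track the highest column value seen so far instead of the last run time
    use_column_value => true
    # Assumption: last_updated is a timestamp column in brainplay.
    # The plugin lowercases column names by default, so use the lowercase name here.
    tracking_column => "last_updated"
    tracking_column_type => "timestamp"
    # Hypothetical path: file where the last seen value is stored between runs
    last_run_metadata_path => "/home/logstash/.brainplay_last_run"
    # Only rows newer than the stored value are requested on each run
    statement => "SELECT * FROM brainplay WHERE last_updated > :sql_last_value ORDER BY last_updated ASC"
  }
}
```

With a setup like this, each scheduled run should ask MySQL only for rows whose last_updated is greater than the stored value. If that column is indexed, the database should not need to scan the whole table on every run.
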
Sufiyan Ghori