0

On Examination of BinLog Found Following Statement : It has All table columns in where clause. The query I passed to mysql had only 1 Column- Primary Key Column, then why the log wrote all columns in BinLog? I have a particular problem with BinLog doing this : This is causing Recovery from BinLog to be very very Slow. I am not a MySQL DBA. Please Share Your Experience On This Subject.

UPDATE `ctbmysql`.`ctm`
### WHERE
###   @1=1139195549890498825
###   @2=1138051383057521436
###   @3=1615397172
###   @4=''
###   @5=1130000000662993985
###   @6=113
###   @7=''
###   @8=''
###   @9=1635370236
###   @10='49.128.173.183'
###   @11='49.128.173.146'
###   @12=''
###   @13=''
###   @14=''
###   @15=0
###   @16=1
###   @17=''
###   @18=''
###   @19=''
###   @20=0
###   @21='google-play'
###   @22=4217121370623809752
###   @23=''
###   @24=1
###   @25=''
###   @26=''
###   @27=0
### SET
###   @1=1139195549890498825
###   @2=1138051383057521436
###   @3=1615397172
###   @4=''
###   @5=1130000000662993985
###   @6=113
###   @7=''
###   @8=''
###   @9=1635453015
###   @10='150.242.24.246'
###   @11='49.128.173.146'
###   @12=''
###   @13=''
###   @14=''
###   @15=0
###   @16=1
###   @17=''
###   @18=''
###   @19=''
###   @20=0
###   @21=''
###   @22=4217121370623809752
###   @23=''
###   @24=1
###   @25=''
###   @26=''
###   @27=0

2 Answers2

1

MySQL supports two different binary log formats as well as a mixture of both:

  • STATEMENT causes logging to be statement based.
  • ROW causes logging to be row based. This is the default.
  • MIXED causes logging to use mixed format.

You can choose one by setting the binlog_format configuration option.

What you are seeing in your log is the row format. It lists the row to update (with all its values). This is not supposed to be your original query, it just looks similar. E.g. if your update affected more than one row, it would create more than one row-updates for your single original update query.

This is the intended and expected behaviour. From Advantages and Disadvantages of Statement-Based and Row-Based Replication:

RBR can generate more data that must be logged. To replicate a DML statement (such as an UPDATE or DELETE statement), statement-based replication writes only the statement to the binary log. By contrast, row-based replication writes each changed row to the binary log. If the statement changes many rows, row-based replication may write significantly more data to the binary log; this is true even for statements that are rolled back. This also means that making and restoring a backup can require more time.

Nevertheless, row bases replication is the default for a reason:

Advantages of row-based replication:

  • All changes can be replicated. This is the safest form of replication.
Solarflare
  • 10,721
  • 2
  • 18
  • 35
  • I changed bin log format to MIXED. Now when restoring from binlogs it flies for some time (for upto 380 MB about 1 MB per second) and then after some time it seems like it is getting exponentially slow (only a few KB per second). I am feeding mysqlbinlog.exe output to mysql.exe on my Resore server. Why would it get slow after some time – user1070759 Nov 03 '21 at 10:39
  • The processing speed in bytes/s of your log files (if that is what you mean) is probably not a good measure. A simple "delete from tablename" uses 21 bytes and can take hours depending on your data. And you can definitely not compare that to the throughput in row format: there you use a lot of bytes for extremely fast queries - e.g. your sample uses around 0,4kb to update a single row via the primary key, which is extremely fast, you can easily do 10.000 updates like that per second. But you may *need* a lot of those fast row format queries to make up for one "slow" statement format query. – Solarflare Nov 03 '21 at 11:28
  • Thank you for some insights. This really leaves me wondering. If the Delete From statement did not take hours on prod then why it would take hours when re executed from BinLog .. My expectation is that all statements should take similar amount of time to execute that they took on Production. If not then how can we achieve that. That will solve my problem of running BinLog resote in fastest possible time. – user1070759 Nov 04 '21 at 10:34
  • That's not what I meant; you asked (I think) about execution speed in MB/s of the log file, which has nothing to do with how fast your queries ran on production. Generally, yes, statement based replication should take a similar time. If it doesn't, this would be different question though, as it has almost nothing to do with your original question about why your log lists all the columns, so you should make it a new question (on https://dba.stackexchange.com/ probably) with details about the situation, e.g. which queries are replicating way to slow, hardware details (ssd vs hdd, ram), .... – Solarflare Nov 04 '21 at 11:07
  • yes that is best place to re frame and post. Thank you ! – user1070759 Nov 04 '21 at 11:44
0

parameter binlog_row_image may does help

Alan
  • 1
  • While this code snippet may be the solution, including a detailed explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – Shawn Hemelstrand Mar 12 '23 at 11:21