
We have a read-only replica of our production database using Availability Groups (SQL Server 2017). The requirement was to set the recovery model to Full on both databases.

Is there a reason why this is a prerequisite? I understand that the primary database needs the Full recovery model so that changes can be replicated, but why does the read-only replica require it as well?

I did some research and I saw that Database Mirroring requires Full Recovery Model as well.

The problem we have is that some queries over millions of rows fill up tempdb, and our DBAs complain. I thought that switching the read-only replica to the Simple recovery model would resolve the issue (since we will never need point-in-time restore on the replica instance), but apparently this is impossible.
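For reference, attempting the switch surfaces the restriction directly. A minimal sketch, assuming a hypothetical database name `ReportingDb`:

```sql
-- Hypothetical database name. On a database that participates in an
-- Availability Group, changing the recovery model is rejected:
ALTER DATABASE [ReportingDb] SET RECOVERY SIMPLE;
-- Fails with error 1468: the operation cannot be performed because the
-- database is involved in a database mirroring session or an
-- availability group.
```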

Could someone explain the requirement or suggest a workaround?

Panagiotis Kanavos
Panos
  • This question mixes up too many incompatible things. AGs aren't *mirroring*; they replaced mirroring. What is shipped between nodes is the transaction log records. You *don't* need a read-only replica anyway if you intend to use it for reporting or for building a data mart. AGs are for failover, with read-only replicas a nice addition – Panagiotis Kanavos Feb 21 '19 at 13:59
  • What are you trying to do? What is the *actual* problem that you want to solve with a read-only replica? You could use replication, for example, to copy data to another node *without* using AGs. You could use change tracking (not CDC) to copy changes from one database to another periodically. You can use a properly designed reporting database to generate reports *orders of magnitude faster* than a read-only replica. You could use OLAP to precalculate and update aggregations automatically. Finally, you can use columnstore indexes, even in Express, to compress data and speed up aggregation queries – Panagiotis Kanavos Feb 21 '19 at 14:02
  • `queries in millions of rows that fill up the tempDB and our DBAs complain.` They should. Millions of rows is tiny data; the current non-clustered TPC-H benchmarks measure performance in the 1–30 TB range. Millions of rows means there are bad or unindexed queries. That's the queries' fault, not mirroring's. Mirroring doesn't use tempdb all that much. What do the execution plans look like? Did you check Query Store? – Panagiotis Kanavos Feb 21 '19 at 14:04
  • As I understand it, the replicas are effectively in a very similar state to a database that's continually being left in the `NORECOVERY` state during an (ongoing) set of `RESTORE` actions (being applied by the AG/Mirroring mechanisms). The only difference being that you're allowed to access their data (and they never see incomplete transactions). – Damien_The_Unbeliever Feb 21 '19 at 14:18
  • Simple recovery on your reporting database will reduce the size of the log and the simplest solution for you would probably be transactional replication. However, this will not prevent tempdb from being used or from filling up during query execution. – Ross Bush Feb 21 '19 at 14:19
  • @PanagiotisKanavos Yes, AGs are not mirroring per se, but a read-only replica has a similar result in practice to mirroring; that's why I compared the two. The queries are pretty simple, with a couple of joins on indexed columns. The reason for the replica was to reduce the load on the actual Prod server, and we wanted it to be as close to real-time as possible. There are certainly other solutions and architectures worth exploring, but I was just curious why the Full recovery model is needed when the AG participant is read-only – Panos Feb 21 '19 at 16:59
  • @Panos you'll get much closer to real time if you use e.g. an in-memory table with a columnstore index on top. No matter how you copy the data, it's the *schema* and indexes that determine how fast queries run. In any case, if there's tempdb pressure, it means the schema isn't appropriate for the queries. – Panagiotis Kanavos Feb 21 '19 at 17:14
  • @Panos replication, AGs put pressure on the transaction log, not tempdb. [Check this article](http://sqlschool.gr/blog/always-on-availability-groups-and-transaction-log-shrink-operation-1150.aspx) The gist is that you'll have to be prepared for larger logs. To alleviate this you should perform more frequent transaction log backups – Panagiotis Kanavos Feb 21 '19 at 17:17
  • @Panos as for how fast you can go with in-memory tables and columnstores, check [Get started with Columnstore for real time operational analytics](https://learn.microsoft.com/en-us/sql/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics?view=sql-server-2017). Using a readable secondary is [described in Performance tip #2](https://learn.microsoft.com/en-us/sql/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics?view=sql-server-2017#performance-tip-2-offload-analytics-to-always-on-readable-secondary) – Panagiotis Kanavos Feb 21 '19 at 17:19
  • @PanagiotisKanavos Thanks for your help. I found an answer to my main question in Chatzipavlis's article that explains why all nodes require Full Recovery Model: "Each time a transactional action is performed on the primary, secondary requests the log blocks that contain the log records of that action. Then a log scanner worker is activated in the primary, which based on the LSNs missing from the secondary gives the corresponding Log Blocks." As for the other links, they will prove very useful as I'll probably suggest to go that route (In-memory and columnstore indices). Thanks again, Panos – Panos Feb 21 '19 at 17:32
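The columnstore route suggested in the comments can be sketched as follows; the table and column names here are hypothetical, not from the original discussion:

```sql
-- Hypothetical reporting table. A nonclustered columnstore index lets
-- analytic scans and aggregations read compressed column segments
-- instead of the rowstore, which typically cuts I/O dramatically on
-- multi-million-row queries.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Sales_Analytics
    ON dbo.Sales (OrderDate, CustomerId, Amount);
```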

0 Answers