Apply Lag Increasing
After initiating data guard service by broker, we saw something unusual about redo applying.
DGMGRL> show database ORCLSTB
Database - ORCLSTB
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 55 minutes 26 seconds (computed 34 seconds ago)
Apply Lag: 55 minutes 26 seconds (computed 34 seconds ago)
Average Apply Rate: 2.78 MByte/s
Real Time Query: OFF
Instance(s):
ORCLSTB1 (apply instance)
ORCLSTB2
Database Status:
SUCCESS
As we can see, the apply lag is increasing. In most running cases, it's because the applying process of standby database was interrupted by something like server reboot. So my first idea is to enable the standby database again.
DGMGRL> enable database orclstb
Enabled.
Usually, it works, but not this case.
Let's see whether managed recovery process (MRP) is working or not.
SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;
PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
DGRD ALLOCATED 0 0 0 0
ARCH CONNECTED 0 0 0 0
DGRD ALLOCATED 0 0 0 0
ARCH CONNECTED 0 0 0 0
ARCH CONNECTED 0 0 0 0
ARCH CONNECTED 0 0 0 0
RFS IDLE 1 0 0 0
RFS IDLE 1 21 13461 1
RFS IDLE 0 0 0 0
MRP0 WAIT_FOR_LOG 2 22 0 0
10 rows selected.
MRP is online but it's waiting for log and doing nothing, this is the archived redo applying, not real-time applying. This symptom usually happens when there's no standby redo log. So next, let's check standby redo logs.
SQL> select group#, thread#, sequence#, used, status from v$standby_log;
GROUP# THREAD# SEQUENCE# USED STATUS
---------- ---------- ---------- ---------- ----------
101 1 0 0 UNASSIGNED
102 2 0 0 UNASSIGNED
103 1 0 0 UNASSIGNED
104 2 0 0 UNASSIGNED
105 1 0 0 UNASSIGNED
106 2 0 0 UNASSIGNED
6 rows selected.
For extensibility of redo logs, I used to use 3 digits to number standby redo groups.
Unused Standby Redo Logs
We do have some standby logs, but they were created long before the standby controlfile was created. Which means, they were created in the primary database. The standby controlfile just inherits the records from the primary controlfile. In such situation, the standby database won't use them for standby redo logs, that's why real-time apply is not working.
Solutions
To enable real-time apply, we should recreate (drop then create) standby redo log groups for database in standby role, more specifically, in standby control file. We cannot count on standby redo logs created in primary database.
Stop Data Guard Service
We should stop data guard synchronization service in broker.
DGMGRL> edit database ORCLCDB set state='TRANSPORT-OFF';
Succeeded.
DGMGRL> edit database ORCLSTB set state='APPLY-OFF';
Succeeded.
Then we make sure that MRP is stopped.
SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby where process like '%MRP%';
no rows selected
Recreate Standby Redo Logfiles
Let's see current logfiles.
SQL> column member format a30;
SQL> select * from v$logfile;
GROUP# STATUS TYPE MEMBER IS_ CON_ID
---------- ------- ------- ------------------------------ --- ----------
2 ONLINE +DATA/ORCLCDB/redo02a.log NO 0
2 ONLINE +DATA/ORCLCDB/redo02b.log NO 0
1 ONLINE +DATA/ORCLCDB/redo01a.log NO 0
1 ONLINE +DATA/ORCLCDB/redo01b.log NO 0
3 ONLINE +DATA/ORCLCDB/redo03a.log NO 0
3 ONLINE +DATA/ORCLCDB/redo03b.log NO 0
4 ONLINE +DATA/ORCLCDB/redo04a.log NO 0
4 ONLINE +DATA/ORCLCDB/redo04b.log NO 0
101 STANDBY +DATA/ORCLCDB/redo101a.log NO 0
102 STANDBY +DATA/ORCLCDB/redo102a.log NO 0
103 STANDBY +DATA/ORCLCDB/redo103a.log NO 0
GROUP# STATUS TYPE MEMBER IS_ CON_ID
---------- ------- ------- ------------------------------ --- ----------
104 STANDBY +DATA/ORCLCDB/redo104a.log NO 0
105 STANDBY +DATA/ORCLCDB/redo105a.log NO 0
106 STANDBY +DATA/ORCLCDB/redo106a.log NO 0
14 rows selected.
We do have standby redo logs, but they are useless as I said above.
Drop Standby Redo Log Groups
We dropped 6 standby logfiles from group 101 to 106.
SQL> alter database drop logfile group 101;
Database altered.
SQL> alter database drop logfile group 102;
Database altered.
SQL> alter database drop logfile group 103;
Database altered.
SQL> alter database drop logfile group 104;
Database altered.
SQL> alter database drop logfile group 105;
Database altered.
SQL> alter database drop logfile group 106;
Database altered.
Please note that, keyword STANDBY in such dropping logfile statements is optional.
Let's check logfile again.
SQL> select group#, member from v$logfile;
GROUP# MEMBER
---------- ----------------------------------------
2 +DATA/ORCLCDB/redo02a.log
2 +DATA/ORCLCDB/redo02b.log
1 +DATA/ORCLCDB/redo01a.log
1 +DATA/ORCLCDB/redo01b.log
3 +DATA/ORCLCDB/redo03a.log
3 +DATA/ORCLCDB/redo03b.log
4 +DATA/ORCLCDB/redo04a.log
4 +DATA/ORCLCDB/redo04b.log
8 rows selected.
Standby redo log groups are removed.
Add Standby Redo Logs Back
I usually use 3 digits to number standby logfile groups.
SQL> alter database add standby logfile thread 1 group 101 '+DATA/ORCLCDB/redo101a.log' size 2048m reuse;
Database altered.
SQL> alter database add standby logfile thread 2 group 102 '+DATA/ORCLCDB/redo102a.log' size 2048m reuse;
Database altered.
SQL> alter database add standby logfile thread 1 group 103 '+DATA/ORCLCDB/redo103a.log' size 2048m reuse;
Database altered.
SQL> alter database add standby logfile thread 2 group 104 '+DATA/ORCLCDB/redo104a.log' size 2048m reuse;
Database altered.
SQL> alter database add standby logfile thread 1 group 105 '+DATA/ORCLCDB/redo105a.log' size 2048m reuse;
Database altered.
SQL> alter database add standby logfile thread 2 group 106 '+DATA/ORCLCDB/redo106a.log' size 2048m reuse;
Database altered.
To correctly create logfile in this case, we should specify REUSE keyword to modify the size clause.
Please note that, the size of standby logfile should be the same as the online redo logfile's.
Let's check logfile again.
SQL> select * from v$logfile;
GROUP# STATUS TYPE MEMBER IS_ CON_ID
---------- ------- ------- ------------------------------ --- ----------
2 ONLINE +DATA/ORCLCDB/redo02a.log NO 0
2 ONLINE +DATA/ORCLCDB/redo02b.log NO 0
1 ONLINE +DATA/ORCLCDB/redo01a.log NO 0
1 ONLINE +DATA/ORCLCDB/redo01b.log NO 0
3 ONLINE +DATA/ORCLCDB/redo03a.log NO 0
3 ONLINE +DATA/ORCLCDB/redo03b.log NO 0
4 ONLINE +DATA/ORCLCDB/redo04a.log NO 0
4 ONLINE +DATA/ORCLCDB/redo04b.log NO 0
101 STANDBY +DATA/ORCLCDB/redo101a.log NO 0
102 STANDBY +DATA/ORCLCDB/redo102a.log NO 0
103 STANDBY +DATA/ORCLCDB/redo103a.log NO 0
GROUP# STATUS TYPE MEMBER IS_ CON_ID
---------- ------- ------- ------------------------------ --- ----------
104 STANDBY +DATA/ORCLCDB/redo104a.log NO 0
105 STANDBY +DATA/ORCLCDB/redo105a.log NO 0
106 STANDBY +DATA/ORCLCDB/redo106a.log NO 0
As we can see, the newly created standby redo logs are not so different from the old files, but I know they are essentially different now.
Start Data Guard Service
We start data guard in order to verify the result.
DGMGRL> edit database ORCLSTB set state='APPLY-ON';
Succeeded.
DGMGRL> edit database ORCLCDB set state='TRANSPORT-ON';
Succeeded.
After a while, we can check the status of standby database.
DGMGRL> show database ORCLSTB
Database - ORCLSTB
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Average Apply Rate: 521.00 KByte/s
Real Time Query: OFF
Instance(s):
ORCLSTB1 (apply instance)
ORCLSTB2
Database Status:
SUCCESS
The apply lag is zero which means the real-time apply is back to work.
Verify Result
Let's see the status of MRP.
SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;
PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
DGRD ALLOCATED 0 0 0 0
ARCH CONNECTED 0 0 0 0
DGRD ALLOCATED 0 0 0 0
ARCH CONNECTED 0 0 0 0
ARCH CONNECTED 0 0 0 0
ARCH CONNECTED 0 0 0 0
MRP0 APPLYING_LOG 2 23 20879 4194304
RFS IDLE 1 22 9594 1
RFS IDLE 0 0 0 0
RFS IDLE 1 0 0 0
10 rows selected.
Let's see the status of standby redo logs.
SQL> select group#, thread#, sequence#, used, status from v$standby_log;
GROUP# THREAD# SEQUENCE# USED STATUS
---------- ---------- ---------- ---------- ----------
101 1 22 4962816 ACTIVE
102 2 23 19822080 ACTIVE
103 1 0 0 UNASSIGNED
104 2 0 0 UNASSIGNED
105 1 0 0 UNASSIGNED
106 2 0 0 UNASSIGNED
6 rows selected.
They are working now.