- Set FastStartFailoverTarget on a database which is the target database to be switched to play the primary role after a failover.
- Set the protection mode to MaxAvailability.
- Set LogXptMode to SYNC.
- Make sure Flashback Database is ON in both databases.
- Make sure Flashback Database are all enabled at tablespace-level in both databases.
- Check Data Guard Broker configuration.
- Set a target standby database at database-level in the broker configuration.
- Set allowable time for data applying lag between both databases.
- Set allowable time for observer lose connection from primary database, if disconnection time exceeds the threshold, it will trigger a fast-start failover.
- Open another console to start observer.
- Enable fast-start failover.
- Check the broker configuration.
- Check Fast-Start Failover configuration.
SQL> SELECT flashback_on FROM v$database;
FLASHBACK_ON
------------------
YES
SQL> SELECT name, flashback_on FROM v$tablespace;
NAME FLA
------------------------------ ---
SYSTEM YES
SYSAUX YES
UNDOTBS1 YES
USERS YES
TEMP YES
EXAMPLE YES
UNDOTBS2 YES
7 rows selected.
$ dgmgrl
DGMGRL> connect sys/password@primdb1
DGMGRL> show configuration verbose;
Configuration - COMPDR
Protection Mode: MaxPerformance
Databases:
primdb - Primary database
standb - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
So far, Fast-Start Failover have not been enabled.
DGMGRL> EDIT DATABASE primdb SET PROPERTY FastStartFailoverTarget='standb';
Property "faststartfailovertarget" updated
DGMGRL> EDIT DATABASE standb SET PROPERTY FastStartFailoverTarget='primdb';
Property "faststartfailovertarget" updated
DGMGRL> show database verbose primdb;
Database - primdb
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
primdb1
primdb2
Properties:
DGConnectIdentifier = 'primdb'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'standb, primdb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'standb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
DGMGRL> show database verbose standb;
Database - standb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
standb1 (apply instance)
standb2
Properties:
DGConnectIdentifier = 'standb'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'primdb, standb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'primdb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
If you have only one standby database, you can skip this step.
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit=30;
Property "faststartfailoverlaglimit" updated
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold=60;
Property "faststartfailoverthreshold" updated
$ dgmgrl
DGMGRL> connect sys@standb1
DGMGRL> START OBSERVER;
Observer started
Since the observer will listen the status of all databases after started, therefore, it's normal for the prompt stays unreturned, and please don't interrupt it.
An observer is usually the third party beside primary and standby to judge the health condition of data guard, but we can choose any one of the nodes to play the role, in this case, we choose the first node (standby01.example.com) of standby RAC.
The observer can be enabled before or after enabling fast-start failover. In our case, we choose to enable observer first.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
DGMGRL> show configuration verbose;
Configuration - COMPDR
Protection Mode: MaxPerformance
Databases:
primdb - Primary database
standb - (*) Physical standby database
(*) Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 60 seconds
Target: standb
Observer: standby01.example.com
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configuration Status:
SUCCESS
DGMGRL> SHOW FAST_START FAILOVER;
Fast-Start Failover: ENABLED
Threshold: 60 seconds
Target: standb
Observer: standby01.example.com
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
All of the configurations are correctly set.
When fast-start is ready, let's simulate a blackout on primary database to test fast-failover function. Our goal is to failover to standby database by the broker when primary database is blackout, and then reinstate the latter standby database automatically after it goes to mount state.
- Keep observer console on the screen.
- Check current applying lag on standby database.
- Shutdown abort the primary database to simulate a blackout.
- Watch the failover process on the observer.
- Manually startup current standby (former primary) database to mount state. Currently, the current standby database is down due to a blackout. The broker will not perform reinstatement on it until it's in the mount state.
- Keep watching the failover process on observer, you can see the broker is trying to reinstate current standby database.
- Restart current standby database (the former primary database). Since the broker startup only one instance of current standby RAC after reinstatement to receive and apply redo logs, you would be better to restart all instances of the current standby database to mount state.
- Check status after a successful fast-start failover.
- On current primary database
- On current standby database
- Overall Data Guard Broker status.
- Switchback to production environment.
DGMGRL> start observer;
Observer started
SQL> column name format a10;
SQL> column value format a12;
SQL> column datum_time format a20;
SQL> column time_computed format a20;
SQL> SELECT name, value, datum_time, time_computed FROM V$DATAGUARD_STATS WHERE name like 'apply lag';
NAME VALUE DATUM_TIME TIME_COMPUTED
---------- ------------ -------------------- --------------------
apply lag +00 00:00:00 11/03/2012 10:13:20 11/03/2012 10:13:35
$ srvctl stop database -d compdb -o abort
After waiting for 60 seconds, the observer is displaying failover message.
...
10:14:22.73 Saturday, November 03, 2012
Initiating Fast-Start Failover to database "standb"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "standb"
10:15:44.17 Saturday, November 03, 2012
You can see the failover is successful.
$ srvctl start database -d compdb -o mount
...
10:17:33.30 Saturday, November 03, 2012
Initiating reinstatement for database "primdb"...
Reinstating database "primdb", please wait...
Operation requires shutdown of instance "primdb2" on database "primdb"
Shutting down instance "primdb2"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "primdb2" on database "primdb"
Starting instance "primdb2"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "primdb" ...
Reinstatement of database "primdb" succeeded
10:20:45.15 Saturday, November 03, 2012
It shows the reinstatement is successful, the data guard architecture is sustained and data synchronization is back to normal.
$ srvctl stop database -d compdb
$ srvctl start database -d compdb -o mount
SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;
INST_ID OPEN_MODE DATABASE_ROLE SWITCHOVER_STATUS
---------- -------------------- ---------------- --------------------
2 READ WRITE PRIMARY TO STANDBY
1 READ WRITE PRIMARY TO STANDBY
SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;
INST_ID OPEN_MODE DATABASE_ROLE SWITCHOVER_STATUS
---------- -------------------- ---------------- --------------------
2 MOUNTED PHYSICAL STANDBY NOT ALLOWED
1 MOUNTED PHYSICAL STANDBY NOT ALLOWED
DGMGRL> SHOW FAST_START FAILOVER;
Fast-Start Failover: ENABLED
Threshold: 60 seconds
Target: primdb
Observer: standby01.example.com
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
DGMGRL> SHOW CONFIGURATION VERBOSE;
Configuration - COMPDR
Protection Mode: MaxPerformance
Databases:
standb - Primary database
primdb - (*) Physical standby database
(*) Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 60 seconds
Target: primdb
Observer: standby01.example.com
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configuration Status:
SUCCESS
DGMGRL> SHOW DATABASE VERBOSE primdb;
Database - primdb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
primdb1
primdb2 (apply instance)
Properties:
DGConnectIdentifier = 'primdb'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'standb, primdb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'standb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
DGMGRL> SHOW DATABASE VERBOSE standb;
Database - standb
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
standb1
standb2
Properties:
DGConnectIdentifier = 'standb'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'primdb, standb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'primdb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
DGMGRL>
DGMGRL> switchover to primdb;
Performing switchover NOW, please wait...
Operation requires a connection to instance "primdb2" on database "primdb"
Connecting to instance "primdb2"...
Connected.
Operation requires shutdown of instance "standb2" on database "standb"
Shutting down instance "standb2"...
Database closed.
Database dismounted.
ORACLE instance shut down.
New primary database "primdb" is opening...
Operation requires shutdown of instance "standb1" on database "standb"
Shutting down instance "standb1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "standb1" on database "standb"
Starting instance "standb1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "primdb"
DGMGRL> show configuration;
Configuration - COMPDR
Protection Mode: MaxPerformance
Databases:
primdb - Primary database
standb - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
If you do enable Flashback Database, but leave one or more tablespaces to be disabled on the feature, the reinstatement will fail like this:
...
21:30:01.61 Saturday, November 03, 2012
Initiating reinstatement for database "primdb"...
Reinstating database "primdb", please wait...
Error: ORA-16653: failed to reinstate database
Failed.
Reinstatement of database "primdb" failed
Though the failover is successful, but the reinstatement is failed.
It's usually enough to do a successful fast-start failover on MaxPerformance mode, but if you have availability issues, you can change the mode into MaxAvailability.
DGMGRL> DISABLE FAST_START FAILOVER;
Disabled.
DGMGRL> EDIT DATABASE 'primdb' SET PROPERTY LogXptMode=SYNC;
Property "logxptmode" updated
DGMGRL> EDIT DATABASE 'standb' SET PROPERTY LogXptMode=SYNC;
Property "logxptmode" updated
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Succeeded.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
DGMGRL> show configuration verbose
Configuration - COMPDR
Protection Mode: MaxAvailability
Databases:
primdb - Primary database
standb - (*) Physical standby database
(*) Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 60 seconds
Target: standb
Observer: standby01.example.com
Lag Limit: 30 seconds (not in use)
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configuration Status:
SUCCESS
DGMGRL> show database verbose primdb
Database - primdb
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
primdb1
primdb2
Properties:
DGConnectIdentifier = 'primdb'
ObserverConnectIdentifier = ''
LogXptMode = 'sync'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'standb, primdb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'standb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
DGMGRL> show database verbose standb
Database - standb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
standb1 (apply instance)
standb2
Properties:
DGConnectIdentifier = 'standb'
ObserverConnectIdentifier = ''
LogXptMode = 'sync'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'primdb, standb'
LogFileNameConvert = ''
FastStartFailoverTarget = 'primdb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName(*)
SidName(*)
StaticConnectIdentifier(*)
StandbyArchiveLocation(*)
AlternateLocation(*)
LogArchiveTrace(*)
LogArchiveFormat(*)
TopWaitEvents(*)
(*) - Please check specific instance for the property value
Database Status:
SUCCESS
Fast-start failover can control the steps of failover automatically, it's fast and easy to setup, and almost no human intervention is needed during failover. As a DBA, you must watch out a false failover very closely, a false failover due to temporary network problems results an unnecessary failover and it costs your time to recover or maybe your credits.
In case a false failover happens during a planned shutdown, you should take one of the following preventive steps to secure the planned shutdown:
- Disable fast-start failover.
- Remain fast-start failover enabled, but stop the observer.
- Remain fast-start failover enabled and observer started, but shutdown the standby database first.