We're going to introduce 3 ways that can failover to a standby database at connect-time in this post.
Using a Persistent Service Name
Connect-Time Failover is a feature of local naming method, and can be easily set up by adding a tns entry into tnsnames.ora file on the client machine, then the client can connect to the second database (listener) if the first database (listener) cannot be connected. With a persistent service name, we can use it in a data guard environment.
Here are the failures that it can help:
- When the first listener does not respond, it will try the second connection.
- When the first database is not up, it will try the second connection.
- When the first destination host cannot be reached, it will try the second connection.
In our scenario, our goal is make Oracle Net to try to connect the second address when the first address fails to be connected. Theoretically, this method can be used when a data guard failover occurred between the primary and standby database, but not a switchover. If you like a connect-time failover to survive across a data guard switchover, you need another way to do it. The advanced way is in the following article: Connect-Time Failover by a Dynamic Service Name.
Add a persistent service name in both databases
Let's add a common service_name into both databases like these:
SQL> --On the primary database
SQL> alter system set service_names='primdb.example.com,compdb.example.com' scope=both;
System altered.
SQL> show parameter service_name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string primdb.example.com, compdb.exa
mple.com
...
SQL> --On the standby database
SQL> alter system set service_names='standb.example.com,compdb.example.com' scope=both;
System altered.
SQL> show parameter service_name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string standb.example.com, compdb.exa
mple.com
Before modification of a tns entry
The original entry in tnsnames.ora before modification is as below.
COMPDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = primary-cluster-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = compdb.example.com)
)
)
With the entry, you can connect to only the primary RAC database via SCAN address, but if the primary fails to be connected, then the connection will end with unconnected.
Cluster SCAN is a new feature of Oracle 11g database, it can do load-balancing between nodes. SCAN also can connect an available node within same cluster during connect-time, which is like Connect-Time Failover, but not the same.
After modification of a tns entry
We added some attributes, the main attribute is ADDRESS_LIST, the sub-attributes are LOAD_BALANCE and FAILOVER.
- LOAD_BALANCE
- FAILOVER
This attribute will choose any one of addresses in the list to connect, since our goal is to failover connections from a primary database (the first address) to a standby database (the second address), so we must set LOAD_BALANCE to OFF.
This attribute will failover the connection to the other address. If you set FAILOVER to OFF, Oracle Net won't try the other address if the first attempt failed.
Let's see the modified entry.
COMPDB =
(DESCRIPTION =
(ADDRESS_LIST =
(LOAD_BALANCE = OFF)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = primary-cluster-scan)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = standby-cluster-scan)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = compdb.example.com)
)
)
Since the common service_name compdb.example.com is accepted by both database. You can design the tns entry to connect both databases with the same service name. If you are planning to apply Connect-Time Failover between RAC nodes, you can set LOAD_BALANCE to ON to take advantage of this feature.
More about Transparent Application Failover (TAF), you may refer to this post: How to Implement Transparent Application Failover (TAF) by Local Naming
Functions test
Next, let's test Connect-Time Failover can do through a persistent service name. In our test case, there are two RAC, one is the primary RAC, the other is the standby RAC. For testing smoothly and uninterruptedly, we start the standby RAC with "READ ONLY WITH APPLY" state, because a standby database with mount state can be connected, but cannot be used by users other than SYSDBA or SYSOPER.
- When the listeners of the primary database are down.
- When the primary database is down.
- When the all the primary nodes are down.
[oracle@primary01 ~]$ sqlplus hr/hr@compdb
...
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
primdb2
SQL> EXIT;
[oracle@primary01 ~]$ srvctl stop listener
[oracle@primary01 ~]$ sqlplus hr/hr@compdb
...
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
standb2
SQL> EXIT;
[oracle@primary01 ~]$ srvctl start listener
[oracle@primary01 ~]$ sqlplus hr/hr@compdb
...
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
primdb1
SQL> exit;
[oracle@primary01 ~]$ srvctl stop database -d compdb
[oracle@primary01 ~]$ sqlplus hr/hr@compdb
...
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
standb1
SQL> exit;
[oracle@primary01 ~]$ srvctl stop database -d compdb
[oracle@standby01 ~]$ sqlplus hr/hr@compdb
...
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
primdb2
...
[root@primary01 ~]# shutdown -h now
...
[root@primary02 ~]# shutdown -h now
...
[oracle@standby01 ~]$ sqlplus hr/hr@compdb
SQL> COLUMN "INSTANCE NAME" FORMAT A15;
SQL> SELECT SYS_CONTEXT('USERENV','INSTANCE_NAME') "INSTANCE NAME" FROM DUAL;
INSTANCE NAME
---------------
standb1
SQL> exit;
Conclusions
In my opinions, there are two architectures that you can apply the feature on the client sides.
- RAC
- Data Guard
If you are working with 9i or 10g database, there is no SCAN on server side, which can dispatch connections to an available node. The Connect-Time Failover of local naming can easily be setup on client side and help you to connect the other within same cluster. Furthermore, if you want to load-balance between nodes, you can set LOAD_BALANCE = ON.
If the primary database is down, client can connect a failovered standby database during connect-time. But you must set LOAD_BALANCE = OFF to prevent the connection stuck on standby database with mount state.
Unfortunately, this feature of local naming cannot survive across a switchover, we need another tricky way to do it, please refer to the next article.
Using a Dynamic Service Name
Adding a service name in the initialization parameter SERVICE_NAME makes the service name to be persistent across restarts, that is, every time all the databases are started, the service are also started. Which means, the method of connect-time failover through a persistent service name is unable to distinguish which database is playing the primary role from the others, it will fail to connect to the true and current primary database after a switchover. Therefore, we need to develop another Connect-Time Failover to survive across a switchover.
Finally, An Oracle document ID 740029.1 guides us through the dark, which is a workaround to make a service to be dynamically and optionally started when the database role is PRIMARY. When the service is started, it will be dynamically registered in the listener. Here are the steps:
- Check the current service name on both sides.
- Create and start a service name by DBMS_SERVICE in the primary database.
- Make sure the service is registered in the listener dynamically.
- Create a trigger to start the current primary service if the database role is PRIMARY when the database startup.
- Edit a tns entry with the dynamic service name on a client machine.
- Test connection before switchover.
- Make a switchover.
- Test connection after switchover.
SQL> --On the primary side
SQL> show parameter service_name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string primdb.example.com
...
SQL> --On the standby side
SQL> show parameter service_name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string standb.example.com
[oracle@primary01 ~]$ sqlplus / as sysdba
...
SQL> exec dbms_service.create_service('current_primary','current_primary');
PL/SQL procedure successfully completed.
SQL> exec dbms_service.start_service('current_primary');
PL/SQL procedure successfully completed.
SQL> exec dbms_service.start_service('current_primary','primdb2');
PL/SQL procedure successfully completed.
SQL> show parameter service_name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string primdb.example.com, current_pr
imary
[oracle@primary01 ~]$ lsnrctl status
...
Services Summary...
...
Service "current_primary.example.com" has 1 instance(s).
Instance "primdb1", status READY, has 1 handler(s) for this service...
...
The command completed successfully
SQL> CREATE OR REPLACE TRIGGER start_current_primary_service after startup on database
2 DECLARE
3 v_role VARCHAR(30);
4 BEGIN
5 SELECT DATABASE_ROLE INTO v_role FROM V$DATABASE;
6 IF v_role = 'PRIMARY' THEN
7 DBMS_SERVICE.START_SERVICE('current_primary');
8 END IF;
9 END;
10 /
Trigger created.
...
COMPDB =
(DESCRIPTION =
(ADDRESS_LIST =
(LOAD_BALANCE = OFF)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = primary-cluster-scan.example.com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = standby-cluster-scan.example.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = current_primary.example.com)
)
)
...
C:\Documents and Settings\Administrator>sqlplus system/password@compdb
...
SQL> select db_unique_name from v$database;
DB_UNIQUE_NAME
------------------------------------------------------------
primdb
DGMGRL> switchover to standb
Performing switchover NOW, please wait...
Operation requires shutdown of instance "primdb2" on database "primdb"
Shutting down instance "primdb2"...
Database closed.
Database dismounted.
ORACLE instance shut down.
New primary database "standb" is opening...
Operation requires shutdown of instance "primdb1" on database "primdb"
Shutting down instance "primdb1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "primdb1" on database "primdb"
Starting instance "primdb1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "standb"
C:\Documents and Settings\Administrator>sqlplus system/password@compdb
...
SQL> select db_unique_name from v$database;
DB_UNIQUE_NAME
------------------------------------------------------------
standb
I can say no more, this kind of Connect-Time Failover through a dynamic service name could be the best solution in a data guard environment before 11g, which can handle both failovers and switchovers.
Using a SRVCTL Configured Service
Starting from 11g, the utility SRVCTL can control service to be up or down under some specific conditions, this could be very useful for DBA to design a model of Connect-Time Failover in a data guard environment.
Our goal is to create a special service name by SRVCTL, which will be up and running if and only if the database is playing the primary role. Then we can edit a tns entry with the configured service name on client machines (or application servers) to guide all the connections to the true and current primary database.
- Check the current status of both cluster.
- Add a configured service in both the primary and standby cluster.
- Show the configuration of the newly created service.
- Check the configuration of both databases in the cluster.
- Restart both databases to make the new service running on the primary server.
- Check the listeners on the primary cluster.
- Edit a tns entry on the client machine.
- Test connection before switchover.
- Make a switchover.
- Test connection before switchover.
On the primary side.
[oracle@primary01 ~]$ srvctl status service -d compdb
[oracle@primary01 ~]$ srvctl config service -d compdb
On the standby side.
[oracle@standby01 ~]$ srvctl status service -d compdb
[oracle@standby01 ~]$ srvctl config service -d compdb
There is no service on both clusters currently.
On the primary side.
[oracle@primary01 ~]$ srvctl add service -d compdb -s primary_service -l primary -r primdb1,primdb2
On the standby side.
[oracle@standby01 ~]$ srvctl add service -d compdb -s primary_service -l primary -r standb1,standb2
This command will add a service named primary_service in the cluster, which will start when the database is started and playing the primary role.
On the primary side.
[oracle@primary01 ~]$ srvctl config service -d compdb -s primary_service
Service name: primary_service
Service is enabled
Server pool: compdb_primary_service
Cardinality: 2
Disconnect: false
Service role: primary
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Preferred instances: primdb1,primdb2
Available instances:
On the standby side.
[oracle@standby01 ~]$ srvctl config service -d compdb -s primary_service
Service name: primary_service
Service is enabled
Server pool: compdb_primary_service
Cardinality: 2
Disconnect: false
Service role: primary
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Preferred instances: standb1,standb2
Available instances:
The default value of "Management policy" is AUTOMATIC which will make the service started automatically when the database is started and playing the primary role.
On the primary side.
[oracle@primary01 ~]$ srvctl config database -d compdb
Database unique name: compdb
Database name: compdb
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +DATA/compdb/spfileprimdb.ora
Domain: example.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: compdb
Database instances: primdb1,primdb2
Disk Groups: DATA
Services: primary_service
Database is administrator managed
On the standby side.
[oracle@standby01 ~]$ srvctl config database -d compdb
Database unique name: compdb
Database name: compdb
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +DATA/compdb/spfilestandb.ora
Domain: example.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: compdb
Database instances: standb1,standb2
Disk Groups: DATA
Services: primary_service
Database is administrator managed
On the primary side.
[oracle@primary01 ~]$ srvctl stop database -d compdb
[oracle@primary01 ~]$ srvctl start database -d compdb
On the standby side.
[oracle@standby01 ~]$ srvctl stop database -d compdb
[oracle@standby01 ~]$ srvctl start database -d compdb -o mount
[oracle@primary01 ~]$ lsnrctl status
...
Service "primary_service.example.com" has 1 instance(s).
Instance "primdb1", status READY, has 1 handler(s) for this service...
...
The newly created service is up and running on the primary cluster, but not on the standby cluster.
COMPDB =
(DESCRIPTION =
(ADDRESS_LIST =
(LOAD_BALANCE = OFF)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = primary-cluster-scan.example.com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = standby-cluster-scan.example.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = primary_service.example.com)
)
)
C:\Documents and Settings\Administrator>sqlplus system/password@compdb
...
SQL> select instance_name from v$instance;
INSTANCE_NAME
--------------------------------
primdb2
[oracle@primary01 ~]$ dgmgrl sys/password@primdb1
...
DGMGRL> show database standb
Database - standb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
standb1
standb2 (apply instance)
Database Status:
SUCCESS
DGMGRL> switchover to standb;
Performing switchover NOW, please wait...
Operation requires shutdown of instance "primdb2" on database "primdb"
Shutting down instance "primdb2"...
Database closed.
Database dismounted.
ORACLE instance shut down.
New primary database "standb" is opening...
Operation requires shutdown of instance "primdb1" on database "primdb"
Shutting down instance "primdb1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "primdb1" on database "primdb"
Starting instance "primdb1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "standb"
C:\Documents and Settings\Administrator>sqlplus system/password@compdb
...
SQL> select instance_name from v$instance;
INSTANCE_NAME
--------------------------------
standb2
very very clear article , thank’s a lot .
You’re welcome, I’m happy to hear that.
Absolutely clear and useful description – well done.
Keep on like that! Cheers, Adam
Thank you, I appreciate the compliment.
very nice article
Thanks!
What circumstances will the tns connection failover to the 2nd address in the list? We had a situation where platinum support was patching our first exadata (couple of days work / downtime), hosts fully shutdown for linux upgrade, and the clients all just hung instead of connecting to the standby exadata, which was currently primary db. We had to change all of the client tns to point to the standby server directly.
This would also apply to the situation where primary site is completely gone. Only standby hosts survived and accessible.
Theoretically, when the first address is inaccessible, it will try the second, the third until connected.
I had a question on Connect-Time Failover to Standby Database and needed your help. How do I setup the above on a standlone cdb database with one pluggable database.?
My dataguard setup is working fine. However the auto client connection is not
There’re 3 ways in this post, you may try another way to complement it.