Since 11g, we rely more and more on SCAN Virtual IP to load balance and failover connections among nodes in a cluster, so the original Virtual IP (VIP) is not as important as the old days. That's why VIP becomes optional for grid installation from 18c.
However, if you have VIP on each node, then SCAN listeners introduce new connections to those VIP, not public IP for local listener to establish connections. Furthermore, some systems may still depend on VIP to fail over connections at connect-time.
In reality, if one node in a cluster failed to service, then its VIP automatically failover to another available node. That is to say, the failover of VIP is basically done when its home node leaves the cluster, but the problem is, we can't use it to connect to the database. It turns out that the failover VIP is useless.
In this post, we not only show you how to failover VIP but also provide a solution to really use the failover VIP.
A. Failover Virtual IP (VIP)
I'm going to make the second node of the RAC servers leave the cluster. Let's see steps in this part.
1. Check Status before Failover
Configuration of Both VIP
[grid@primary01 ~]$ srvctl config vip -vip primary01-vip
VIP exists: network number 1, hosting node primary01
VIP Name: primary01-vip
VIP IPv4 Address: 192.168.0.111
VIP IPv6 Address:
VIP is enabled.
VIP is individually enabled on nodes:
VIP is individually disabled on nodes:
[grid@primary01 ~]$ srvctl config vip -vip primary02-vip
VIP exists: network number 1, hosting node primary02
VIP Name: primary02-vip
VIP IPv4 Address: 192.168.0.112
VIP IPv6 Address:
VIP is enabled.
VIP is individually enabled on nodes:
VIP is individually disabled on nodes:
As you can see, every VIP belongs to its home node.
192.168.0.112 primary02-vip
Resource Status of Both VIP
[grid@primary01 ~]$ crsctl status resource ora.primary01.vip
NAME=ora.primary01.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on primary01
[grid@primary01 ~]$ crsctl status resource ora.primary02.vip
NAME=ora.primary02.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on primary02
Current Status of Both VIP (srvctl status vip)
[grid@primary01 ~]$ srvctl status vip -vip primary01-vip
VIP 192.168.0.111 is enabled
VIP 192.168.0.111 is running on node: primary01
[grid@primary01 ~]$ srvctl status vip -vip primary02-vip
VIP 192.168.0.112 is enabled
VIP 192.168.0.112 is running on node: primary02
2. Make the Second Node Leave Cluster
Here we stop the services on node 2 by the normal procedure, you can also perform a blackout on node 2. They have the same effect.
Setup Grid Environment for root
We setup a grid environment for root, because we have to use root to perform some critical operations on the cluster.
[root@primary01 ~]# . /home/grid/.bash_profile
Stop Cluster on Node 2 by Root
Stopping all cluster service on node 2 needs root privilege.
[root@primary01 ~]# crsctl stop cluster -n primary02
CRS-2673: Attempting to stop 'ora.crsd' on 'primary02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on server 'primary02'
CRS-2673: Attempting to stop 'ora.chad' on 'primary02'
CRS-2673: Attempting to stop 'ora.orclcdb.db' on 'primary02'
CRS-2677: Stop of 'ora.orclcdb.db' on 'primary02' succeeded
CRS-33673: Attempting to stop resource group 'ora.asmgroup' on server 'primary02'
CRS-2673: Attempting to stop 'ora.OCR.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.MGMT.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'primary02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'primary02'
CRS-2677: Stop of 'ora.DATA.dg' on 'primary02' succeeded
CRS-2677: Stop of 'ora.OCR.dg' on 'primary02' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'primary02' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.primary02.vip' on 'primary02'
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'primary02'
CRS-2677: Stop of 'ora.MGMT.dg' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'primary02'
CRS-2677: Stop of 'ora.primary02.vip' on 'primary02' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'primary02' succeeded
CRS-2677: Stop of 'ora.chad' on 'primary02' succeeded
CRS-2677: Stop of 'ora.asm' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'primary02'
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asmnet1.asmnetwork' on 'primary02'
CRS-2677: Stop of 'ora.asmnet1.asmnetwork' on 'primary02' succeeded
CRS-33677: Stop of resource group 'ora.asmgroup' on server 'primary02' succeeded.
CRS-2672: Attempting to start 'ora.scan1.vip' on 'primary01'
CRS-2672: Attempting to start 'ora.primary02.vip' on 'primary01'
CRS-2676: Start of 'ora.scan1.vip' on 'primary01' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'primary01'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'primary01' succeeded
CRS-2676: Start of 'ora.primary02.vip' on 'primary01' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'primary02'
CRS-2677: Stop of 'ora.ons' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'primary02'
CRS-2677: Stop of 'ora.net1.network' on 'primary02' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'primary02' has completed
CRS-2677: Stop of 'ora.crsd' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'primary02'
CRS-2677: Stop of 'ora.storage' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'primary02'
CRS-2677: Stop of 'ora.asm' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'primary02'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'primary02'
CRS-2673: Attempting to stop 'ora.evmd' on 'primary02'
CRS-2677: Stop of 'ora.ctssd' on 'primary02' succeeded
CRS-2677: Stop of 'ora.evmd' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'primary02'
CRS-2677: Stop of 'ora.cssd' on 'primary02' succeeded
3. Check Status After Failover
Current Status of Failover VIP (srvctl status vip)
[grid@primary01 ~]$ srvctl status vip -vip primary02-vip
VIP 192.168.0.112 is enabled
VIP 192.168.0.112 is running on node: primary01
Good, it's on the first node.
Resource Status of Failover VIP
[grid@primary01 ~]$ crsctl status resource ora.primary02.vip
NAME=ora.primary02.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=INTERMEDIATE on primary01
Did you see that? The state of failover VIP is INTERMEDIATE, not ONLINE. According to the Resource State of Oracle Clusterware 19c, INTERMEDIATE means:
A resource is partially online. For example, the Oracle Database VIP resource fails over to another server when its home server leaves the cluster. However, applications cannot use this VIP to access the database while it is on a non-home server. Similarly, when an Oracle Database instance is started and not open, the resource is partially online: it is running but is not available to provide services.
Does it mean that we can't use the failover VIP? It's a bad news.
NIC Status of Failover VIP
The status of the first network interface of the first node proves that it has bound the VIP.
[grid@primary01 ~]$ ifconfig -a
...
ens33:6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.112 netmask 255.255.255.0 broadcast 192.168.0.255
ether 00:0c:29:dc:65:90 txqueuelen 1000 (Ethernet)
Yes, it did fail over to the first node.
Listener Status
[grid@primary01 ~]$ lsnrctl status
...
Listener Parameter File /u01/app/19.0.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/primary01/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.11)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.111)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=primary01)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/db_1/admin/ORCLCDB/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_MGMT" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_OCR" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "ORCLCDB" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "ORCLCDBXDB" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "b19aa2333d82dcfbe0530c2aa8c0ee3a" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "orclpdb" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
The command completed successfully
No, we don't see 192.168.0.112 (primary02-vip) on the list of the listener. That is to say, no listener for 192.168.0.112 incoming connetions.
All facts tell us that we cannot use the failover VIP to service external connections, although VIP actually failed over to another node. It seems to be the end of story.
Next, let's see our solution.
B. Solution to Use Failover VIP
The solution is to create a new listener running upon the failover VIP. Let's see steps in this part.
- Create a listener in ORACLE_HOME by oracle
- Add Listener to Database
- Startup the New Listener
- Show Listener Status
- Test Connection
1. Create a listener in ORACLE_HOME by oracle
Please note that, we configure a new listener at database-level, not grid-level.
Node 1
[oracle@primary01 ~]$ vi $ORACLE_HOME/network/admin/listener.ora
LISTENER2 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.112)(PORT = 1521))
)
)
Node 2
[oracle@primary02 ~]$ vi $ORACLE_HOME/network/admin/listener.ora
LISTENER2 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.111)(PORT = 1521))
)
)
2. Add Listener to Database
We use multiple addresses to set LOCAL_LISTENER with both scope and all SID. This can make the database register services with the new listener.
[oracle@primary01 ~]$ sqlplus / as sysdba
...
SQL> alter system set local_listener='(ADDRESS_LIST=(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.111)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.112)(PORT = 1521)))' scope=both sid='*';
System altered.
SQL> show parameter local_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
local_listener string (ADDRESS_LIST=(ADDRESS = (PROT
OCOL = TCP)(HOST = 192.168.0.1
11)(PORT = 1521))(ADDRESS = (P
ROTOCOL = TCP)(HOST = 192.168.
0.112)(PORT = 1521)))
3. Startup the New Listener
Since the VIP of node 2 has been failed over to node 1, we can start LISTENER2 at node 1. Vice versa, if node 1 failed, we should start it on node 2.
[oracle@primary01 ~]$ lsnrctl start listener2
LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 26-DEC-2020 15:32:15
Copyright (c) 1991, 2019, Oracle. All rights reserved.
Starting /u01/app/oracle/product/19.0.0/db_1/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 19.0.0.0.0 - Production
System parameter file is /u01/app/oracle/product/19.0.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/diag/tnslsnr/primary01/listener2/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.112)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias listener2
Version TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date 26-DEC-2020 15:32:15
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/oracle/product/19.0.0/db_1/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/primary01/listener2/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))
The listener supports no services
The command completed successfully
Of course, no service registers with the listener so far. It's waiting for services to register with.
4. Show Listener Status
[oracle@primary01 ~]$ lsnrctl status listener2
...
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))
Services Summary...
Service "ORCLCDB" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "ORCLCDBXDB" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "b19aa2333d82dcfbe0530c2aa8c0ee3a" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "orclpdb" has 1 instance(s).
Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
The command completed successfully
All services are back.
5. Test Connection
We use SQL*Plus to test the connection of the database from a client.
C:\Users\scott>sqlplus system/[email protected]:1521/orclcdb
SQL*Plus: Release 11.2.0.4.0 Production on Wed Dec 26 15:53:17 2020
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
ORCLCDB1
We made it.
Later on, the service of LISTENER2 will become unavailable when node 2 comes back.