In Oracle MAA, Data Guard plays a very important role in protecting our data at a remote location. Since it involves the network and the interconnection between two databases, troubleshooting can be pretty complicated. With so many moving parts, you might see ORA-16055: FAL request rejected at times.
A network problem could involve the host, router, firewall, LAN, VPN, IDS, IPS, leased line, and so on. Any node on the route could cost you time in tracking down the problem.
The reaction of the standby database is also important; it is not purely a passive receiver. Don't expect it to digest everything transported from FAL_SERVER.
Now, let's see how I found ORA-16055: FAL request rejected and how I handled it.
ORA-16055: FAL request rejected
I found the error in the alert log of the primary database:
...
Sun Feb 14 19:19:24 2010
ARC7: Begin FAL archive (thread 1 sequence 235013 destination STANDB1)
Creating archive destination LOG_ARCHIVE_DEST_2: 'STANDB1'
ARC7: FAL archive, error 270 closing archivelog file 'STANDB1'
Sun Feb 14 19:19:36 2010
Errors in file /oracle/admin/PRIMARY01/bdump/primdb1_arc7_1288.trc:
ORA-00270: error creating archive log
ARC7: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
Sun Feb 14 19:19:36 2010
ORACLE Instance primdb1 - Archival Error. Archiver continuing.
ARCH: Connecting to console port...
Sun Feb 14 19:19:36 2010
ORA-16055: FAL request rejected
ARCH: Connecting to console port...
Sun Feb 14 19:19:36 2010
Errors in file /oracle/admin/PRIMARY01/bdump/primdb1_arc7_1288.trc:
ORA-16055: FAL request rejected
...
From the primary database's point of view, this message implies that it tried to communicate with the standby database, but the standby did not respond as expected.
Let's check the error code ORA-16055 first:
[oracle@test ~]$ oerr ora 16055
16055, 0000, "FAL request rejected"
// *Cause: Fetch archive log (FAL) server rejected a redo gap fetch request
// from the client. This may have various causes.
// *Action: Check the alert log on the primary database. Take the appropriate
// action to solve the problem.
"This may have various causes." Yes, that's right. The possible causes depend heavily on your environment, and the above message does not offer much useful information.
This error appeared in the alert log of the primary database, but we should take appropriate actions on the standby database.
If ORA-16055 occurred only once in the alert log, you don't have to worry about it; the FAL request has probably gone back to normal. If ORA-16055: FAL request rejected keeps repeating in the alert log, you should solve the problem, even though it might not be critical at this moment.
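To tell a one-off from a repeating error, you can count the distinct timestamps under which ORA-16055 was logged. The following is only a minimal sketch in Python: it assumes the alert log uses the timestamp style shown above (for example "Sun Feb 14 19:19:36 2010") and an English locale, and the function name find_error_times is made up for illustration.

```python
from datetime import datetime

# Timestamp lines in this alert log look like "Sun Feb 14 19:19:36 2010".
# Assumption: newer Oracle releases may use a different format.
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def find_error_times(alert_log_text, error_code="ORA-16055"):
    """Return the sorted distinct timestamps under which error_code was logged."""
    current = None
    seen = set()
    for raw_line in alert_log_text.splitlines():
        line = raw_line.strip()
        try:
            # A timestamp line applies to all messages that follow it.
            current = datetime.strptime(line, TIMESTAMP_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line, treat it as a message
        if line.startswith(error_code) and current is not None:
            seen.add(current)
    return sorted(seen)


sample = """\
Sun Feb 14 19:19:36 2010
ORA-16055: FAL request rejected
ARCH: Connecting to console port...
Sun Feb 14 19:19:36 2010
Errors in file /oracle/admin/PRIMARY01/bdump/primdb1_arc7_1288.trc:
ORA-16055: FAL request rejected
"""

times = find_error_times(sample)
print(len(times), "distinct occurrence(s) of ORA-16055")
```

In the sample, the error is written twice under the same timestamp, so it counts as a single occurrence. Many distinct timestamps over a long period would suggest a problem that keeps coming back.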
Doing Some Tests
The error means that something went wrong between the primary database and the standby database. At first, I thought it was a network problem, but the standby database could be reached via tnsping:
[oracle@primary01 ~] $ tnsping STANDB1 3
...
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.4.23.32)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SID = standb1) (GLOBAL_NAME = ORCL)))
OK (10 msec)
OK (10 msec)
OK (0 msec)
As we know, tnsping tests only part of the connection health, such as the host and the listener, but not the service or the database. For more information, you may check the post below:
TNSPING Errors Collections
Several Possibilities
In the above test, the result shows that the listener of the standby database was online and working at that moment, but redo transport from the primary database had stopped. There could be several possibilities:
- The standby listener hung or was told by PMON not to accept new connections.
- A local archived log file is missing.
- Poor network conditions, such as a temporary traffic jam over leased lines.
- The standby database needs to be recovered.
- The space used by the standby database is full, in either the data or the archived log destination.
- The standby database was too busy receiving or switching logs to respond.
- One of the user's system limits has been reached.
If your standby database is opened read-only, it could hit the maximum number of processes. Another possibility is that the listener hung.
You should make sure the archived log files required by the standby database are still there on the primary database.
An intermittent network could result in the error, so you should gauge the network condition.
There could be a damaged data file that needs to be recovered, or the broker may be malfunctioning.
When the space is full, the standby database will accept no more redo changes from the primary database.
If the standby server runs not only the database but also other services, they could exhaust the OS, so you should watch them closely.
The limit reached is usually the maximum number of open files or the maximum size of a created file.
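Two of the possibilities above, a full destination and a reached user limit, are easy to check on the standby host. Here is a rough sketch using only the Python standard library on a Unix-like system. The path "/" is just a placeholder for your real archived log destination, and the helper names are made up for illustration. Note that it reports the limits of the process running the script, so run it as the OS user that owns the database.

```python
import resource  # Unix only
import shutil


def used_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo filesystems report a size of zero
    return usage.used / usage.total


def limits_report():
    """Soft and hard limits for open files and maximum file size."""
    names = {
        "open files": resource.RLIMIT_NOFILE,
        "file size": resource.RLIMIT_FSIZE,
    }
    return {label: resource.getrlimit(rlimit) for label, rlimit in names.items()}


# Placeholder path: replace it with the archived log destination of the standby.
archive_dest = "/"
print(f"{archive_dest} is {used_fraction(archive_dest):.0%} full")

for label, (soft, hard) in limits_report().items():
    # resource.RLIM_INFINITY means the limit is unlimited.
    print(f"{label}: soft={soft} hard={hard}")
```

A destination close to 100% full or a low soft limit on open files would be worth a closer look before digging into the network.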
Cleaning Archived Logs
Finally, we found that the archived log destination on the standby database was full. After we freed up space there, redo transport resumed automatically. You will see FAL messages in the alert log like this:
ARCX: Begin FAL archive ...
ARCX: Complete FAL archive ...
...
Sun Feb 14 21:17:08 2010
ARC4: Begin FAL archive (thread 1 sequence 235020 destination STANDB1)
Creating archive destination LOG_ARCHIVE_DEST_2: 'STANDB1'
Sun Feb 14 21:17:21 2010
ARC1: Complete FAL archive (thread 1 sequence 235019 destination STANDB1)
Sun Feb 14 21:17:21 2010
ARC5: Begin FAL archive (thread 1 sequence 235021 destination STANDB1)
Creating archive destination LOG_ARCHIVE_DEST_2: 'STANDB1'
Sun Feb 14 21:21:23 2010
ARC4: Complete FAL archive (thread 1 sequence 235020 destination STANDB1)
...
No more ORA-16055: FAL request rejected.
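If you want to confirm that every FAL archive that began has also completed, you can pair the Begin and Complete messages by thread and sequence. This is only a sketch based on the message format shown in the alert log above. The function name fal_status is made up for illustration, and other Oracle versions may word these messages differently.

```python
import re

# Matches lines such as:
# ARC4: Begin FAL archive (thread 1 sequence 235020 destination STANDB1)
FAL_LINE = re.compile(
    r"^ARC\w+: (Begin|Complete) FAL archive "
    r"\(thread (\d+) sequence (\d+) destination (\S+)\)"
)


def fal_status(alert_log_text):
    """Map (thread, sequence) to the last FAL state seen: 'Begin' or 'Complete'."""
    status = {}
    for line in alert_log_text.splitlines():
        match = FAL_LINE.match(line.strip())
        if match:
            state, thread, sequence, _destination = match.groups()
            status[(int(thread), int(sequence))] = state
    return status


sample = """\
Sun Feb 14 21:17:08 2010
ARC4: Begin FAL archive (thread 1 sequence 235020 destination STANDB1)
Creating archive destination LOG_ARCHIVE_DEST_2: 'STANDB1'
Sun Feb 14 21:17:21 2010
ARC1: Complete FAL archive (thread 1 sequence 235019 destination STANDB1)
Sun Feb 14 21:17:21 2010
ARC5: Begin FAL archive (thread 1 sequence 235021 destination STANDB1)
Creating archive destination LOG_ARCHIVE_DEST_2: 'STANDB1'
Sun Feb 14 21:21:23 2010
ARC4: Complete FAL archive (thread 1 sequence 235020 destination STANDB1)
"""

status = fal_status(sample)
pending = sorted(seq for (_thread, seq), state in status.items() if state == "Begin")
print("Still in progress:", pending)
```

For the excerpt above, sequences 235019 and 235020 are complete and only 235021 is still in progress, which is what you would expect while the gap is being resolved.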