Tuesday, July 30, 2019

How do identify a transport error in a Data Guard setup using the broker


Applicable to Oracle 12.2

If the broker has been setup, check the configuration basics:
dgmgrl /
show configuration;
Configuration - DGConfig1

  Protection Mode: MaxPerformance
  Members:
  proddb01       - Primary database
    Error: ORA-16778: redo transport error for one or more members

    proddb01_stby1 - Physical standby database
      Warning: ORA-16809: multiple warnings detected for the member

    proddb01_stby2 - Physical standby database
      Warning: ORA-16809: multiple warnings detected for the member

In the configuration above, I have a primary + two physical standby databases.

Check the database:
DGMGRL> show database proddb01;

Database - proddb01

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    proddb01
      Error: ORA-16737: the redo transport service for member "proddb01_stby1" has an error

Use the 'LogXptStatus' monitorable property to check for transporation errors:
DGMGRL> show database proddb01 'LogXptStatus'
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME     STATUS                ERROR
            proddb01       proddb01_stby1      ERROR ORA-16191: Primary log shipping client not logged on standby
            proddb01       proddb01_stby2      VALID

The error ORA-16191 was in my case related to password files not being identical on the standby and the primary database.

For the standby databases, you can use the monitorable property 'RecvQEntries', which returns a table indicating all log files that were received by the standby database but have not yet been applied:
DGMGRL>  show database proddb01_stby1 'RecvQEntries'
STANDBY_RECEIVE_QUEUE
              STATUS     RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED        FIRST_CHANGE#         NEXT_CHANGE#       SIZE (KBs)
         NOT_APPLIED        894183587                1               110406  07/19/2019 10:28:10  07/19/2019 10:29:19          73008920461          73009040943           424661
         NOT_APPLIED        894183587                1               110408  07/19/2019 10:30:35  07/19/2019 10:31:50          73009169966          73009301192           442727
         NOT_APPLIED        894183587                1               110409  07/19/2019 10:31:50  07/19/2019 10:33:14          73009301192          73009443216           426723
         NOT_APPLIED        894183587                1               110410  07/19/2019 10:33:14  07/19/2019 10:34:32          73009443216  

The same information can be obtained by checking the view v$archived_log, like this:
select sequence#
from v$archived_log
where registrar = 'RFS'
and   applied = 'NO';

The actual lag that these non-applied logs causes, can be found with the query:
set lines 200
col source_db_unique_name format a20
col value format a20
SELECT source_db_unique_name,name,value,unit,time_computed,datum_time
FROM v$dataguard_stats
WHERE name = 'apply lag';

The RecvQEntries property is documented here

The LogXptStatus property is documentet here

2 comments:

  1. Thank you for Practical example and straightforward info

    ReplyDelete
  2. Very good and useful article thanks

    ReplyDelete