
Troubleshooting the crsd.bin and listener process

ANBOB

Recently we have run into several cases of the Oracle listener terminating unexpectedly. This post covers another one, this time caused by the listener whitelist configuration. For system hardening, we enabled the listener whitelist (TCP.VALIDNODE_CHECKING = yes) on some RAC environments, but at some point after VALIDNODE_CHECKING was enabled the listener terminated unexpectedly.


[oracle@anbob2:/home/oracle]# ps -ef|grep lsnr
  oracle 5111994 4719182   0 13:55:31  pts/3  0:00 grep lsnr

[oracle@anbob2:/home/oracle]# ps -ef|grep smon
  oracle 3080330       1   0 03:15:11      -  0:03 ora_smon_weejar2
    root 5505986       1   4 03:14:44      - 10:03 /oracle/app/11.2.0.3/grid/bin/osysmond.bin

Note:
As you can see, the listener process no longer exists. We first restore service and then analyze the cause.
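
The "ps -ef|grep lsnr" check above returns only the grep command itself. A minimal sketch of a check that avoids this self-match, assuming the listener binary is named tnslsnr (the name that appears in the lsnrctl output in this post); the function name listener_running is hypothetical:

```shell
# Hypothetical helper: succeeds if `ps -ef` style text on stdin shows a tnslsnr process.
# The bracket in the pattern keeps the grep command line itself from matching.
listener_running() {
    grep -q '[t]nslsnr'
}

# Usage on a live host:
#   ps -ef | listener_running && echo "listener is up" || echo "listener is down"
```

Reading from stdin keeps the function easy to try against saved ps output as well as on a live host.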


anbob2:/home/grid> lsnrctl start

LSNRCTL for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production on 25-JUN-2015 13:58:33

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Starting /oracle/app/11.2.0.3/grid/bin/tnslsnr: please wait...

TNSLSNR for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
System parameter file is /oracle/app/11.2.0.3/grid/network/admin/listener.ora
Log messages written to /oracle/app/grid/diag/tnslsnr/anbob2/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
Error listening on: (ADDRESS=(PROTOCOL=tcp)(IP=LOOPBACK))
TNS-01191: Failed to initialize the local OS authentication subsystem
 TNS-12560: TNS:protocol adapter error
  TNS-00584: Valid node checking configuration error


# /etc/hosts
.96.60.13    anbob1
.96.60.113   anbob1-vip
.168.60.13   anbob1-pri
.96.60.14    anbob2
.96.60.114   anbob2-vip
.96.60.206   anbob-scan

  
anbob2:/oracle/app/11.2.0.3/grid/network/admin> vi sqlnet.ora

NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

ADR_BASE = /oracle/app/grid
TCP.INVITED_NODES=(10.120.159.102,133.96.60.113,133.96.60.114,133.96.60.13,133.96.60.14,133.96.60.17,133.96.60.19)
TCP.VALIDNODE_CHECKING=yes

Note:
The INVITED_NODES list above currently contains the public IPs and VIPs of both nodes. The error message points to a problem in the sqlnet.ora file, so the first step is to disable that file. By the way, Oracle Listener 11g and above also accepts wildcards in this list, for example:

tcp.validnode_checking = yes
tcp.invited_nodes = (133.96.93.*)
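
To get a feel for which addresses a wildcard entry like the one above would admit, here is a rough sketch that treats the * as a shell glob. This only approximates the listener's own matching rule, which is not shown in this post; the function name matches_entry is hypothetical:

```shell
# Hypothetical helper: succeeds if address $1 matches the invited-nodes entry $2,
# treating * in the entry as a shell glob. This only approximates the listener's rule.
matches_entry() {
    case "$1" in
        $2) return 0 ;;   # unquoted on purpose so that * acts as a pattern
        *)  return 1 ;;
    esac
}

# Usage:
#   matches_entry 133.96.93.25 '133.96.93.*' && echo "admitted"
```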


anbob2:/oracle/app/11.2.0.3/grid/network/admin> mv sqlnet.ora sqlnet.ora_bak
anbob2:/oracle/app/11.2.0.3/grid/network/admin> lsnrctl start

LSNRCTL for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production on 25-JUN-2015 13:59:40

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Starting /oracle/app/11.2.0.3/grid/bin/tnslsnr: please wait...

TNSLSNR for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
System parameter file is /oracle/app/11.2.0.3/grid/network/admin/listener.ora
Log messages written to /oracle/app/grid/diag/tnslsnr/anbob2/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Start Date                25-JUN-2015 13:59:40
Uptime                    0 days 0 hr. 0 min. 2 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /oracle/app/11.2.0.3/grid/network/admin/listener.ora
Listener Log File         /oracle/app/grid/diag/tnslsnr/anbob2/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
The listener supports no services
The command completed successfully

Note:
The listener process is now running, but no services have registered with it dynamically, and manual registration was also unsuccessful.


anbob2:/oracle/app/11.2.0.3/grid/network/admin> sqlplus dbmt/dbmt_dba123@133.96.60.14/weejar.anbob.com

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 25 14:04:35 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

ERROR:
ORA-12541: TNS:no listener

Check the CRS status:


anbob2:/home/grid> crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services <<<
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

anbob2:/home/grid> crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

anbob2:/home/grid> crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
        OFFLINE OFFLINE                               Instance Shutdown   
ora.cluster_interconnect.haip
        ONLINE  ONLINE       anbob2                                       
ora.crf
        ONLINE  ONLINE       anbob2                                       
ora.crsd
        ONLINE  OFFLINE                                                   
ora.cssd
        ONLINE  ONLINE       anbob2                                       
ora.cssdmonitor
        ONLINE  ONLINE       anbob2                                       
ora.ctssd
        ONLINE  ONLINE       anbob2                   OBSERVER            
ora.diskmon
        OFFLINE OFFLINE                                                   
ora.drivers.acfs
        ONLINE  OFFLINE                                                   
ora.evmd
        ONLINE  ONLINE       anbob2                                       
ora.gipcd
        ONLINE  ONLINE       anbob2                                       
ora.gpnpd
        ONLINE  ONLINE       anbob2                                       
ora.mdnsd
        ONLINE  ONLINE       anbob2 
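
In the -init listing above, the resources worth a closer look are those whose TARGET is ONLINE while STATE is OFFLINE (here ora.crsd and ora.drivers.acfs). A small sketch that picks them out of saved output, assuming resource names start with "ora." and each is followed by detail lines beginning with TARGET and STATE; the function name is hypothetical:

```shell
# Hypothetical helper: reads `crsctl stat res -t` style text on stdin and prints
# the name of each resource that has TARGET=ONLINE but STATE=OFFLINE.
offline_but_wanted() {
    awk '
        /^ora\./ { name = $1; next }            # resource name line
        $1 == "ONLINE" && $2 == "OFFLINE" {     # detail line: TARGET STATE [SERVER ...]
            print name
        }
    '
}

# Usage on a live host:
#   crsctl stat res -t -init | offline_but_wanted
```

A resource with several matching detail lines (one per node) is printed once per line.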
      
Try restarting CRS as follows:

anbob2:/> /oracle/app/11.2.0.3/grid/bin/crsctl stop crs -f

anbob2:/> crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
anbob2:/> crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       anbob1                                       
               ONLINE  ONLINE       anbob2                                       
ora.asm
               OFFLINE OFFLINE      anbob1                   Instance Shutdown   
               OFFLINE OFFLINE      anbob2                   Instance Shutdown   
ora.gsd
               OFFLINE OFFLINE      anbob1                                       
               OFFLINE OFFLINE      anbob2                                       
ora.net1.network
               ONLINE  ONLINE       anbob1                                       
               ONLINE  ONLINE       anbob2                                       
ora.ons
               ONLINE  ONLINE       anbob1                                       
               ONLINE  ONLINE       anbob2                                       
ora.registry.acfs
               OFFLINE OFFLINE      anbob1                                       
               OFFLINE OFFLINE      anbob2                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
        ONLINE  ONLINE       anbob1                                       
ora.cvu
        ONLINE  ONLINE       anbob1                                       
ora.anbob1.vip
        ONLINE  ONLINE       anbob1                                       
ora.anbob2.vip
        ONLINE  ONLINE       anbob2                                       
ora.oc4j
        ONLINE  ONLINE       anbob1                                       
ora.scan1.vip
        ONLINE  ONLINE       anbob1    
      
      
SQL> startup
SQL> alter system register;
 
anbob2:/home/grid> lsnrctl status
LSNRCTL for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production on 25-JUN-2015 14:18:40
Copyright (c) 1991, 2011, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Start Date                25-JUN-2015 14:03:29
Uptime                    0 days 0 hr. 15 min. 11 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /oracle/app/11.2.0.3/grid/network/admin/listener.ora
Listener Log File         /oracle/app/grid/diag/tnslsnr/anbob2/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=133.96.60.14)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=133.96.60.114)(PORT=1521)))
Services Summary...
Service "weejar.anbob.com" has 1 instance(s).
  Instance "weejar2", status READY, has 1 handler(s) for this service...
Service "weejar_XPT.anbob.com" has 1 instance(s).
  Instance "weejar2", status READY, has 1 handler(s) for this service...
The command completed successfully

Note:
After the CRS restart, everything works again. Now we can troubleshoot the cause of the problem. The following are the cssd.log and crsd.log entries from the problem period.

# cssd.log

-06-25 03:15:28.680
[evmd(3146336)]CRS-1401:EVMD started on node anbob2.
-06-25 03:15:29.721
[/oracle/app/11.2.0.3/grid/bin/orarootagent.bin(3408580)]CRS-5018:(:CLSN00037:) Removed unused HAIP route:  169.254.255.255 / 255.255.0.0 / 169.254.67.180 / en9
-06-25 03:15:31.085
[crsd(3670228)]CRS-0813:Cluster Ready Service aborted due to failure to initialize the network layer with error [clsclisten failed with ret 3
(File: caa_Socket.cpp, line: 525
]. Details at (:CRSD00133:) in /oracle/app/11.2.0.3/grid/log/anbob2/crsd/crsd.log.
-06-25 03:15:31.199
[ohasd(1900762)]CRS-2765:Resource 'ora.crsd' has failed on server 'anbob2'.
-06-25 03:15:38.047
[crsd(2556174)]CRS-1012:The OCR service started on node anbob2.
-06-25 03:15:39.829
[crsd(2556174)]CRS-0813:Cluster Ready Service aborted due to failure to initialize the network layer with error [clsclisten failed with ret 3
(File: caa_Socket.cpp, line: 525
]. Details at (:CRSD00133:) in /oracle/app/11.2.0.3/grid/log/anbob2/crsd/crsd.log.
-06-25 03:15:39.940
[ohasd(1900762)]CRS-2765:Resource 'ora.crsd' has failed on server 'anbob2'.
-06-25 03:15:46.780
[crsd(2753712)]CRS-1012:The OCR service started on node anbob2.
-06-25 03:15:47.427

# crsd.log

-06-25 03:15:28.924: [    CRSD][1] AuthLoc /oracle/app/11.2.0.3/grid/auth/crs/anbob2
-06-25 03:15:28.924: [    CRSD][1] PE active version: 11.2.0.3.0
-06-25 03:15:28.924: [    CRSD][1] PE Engine: NEW
-06-25 03:15:28.924: [    CRSD][1] Using OCR batch ops : ENABLED
-06-25 03:15:28.924: [ CRSMAIN][1] Creating RTI lock info...
-06-25 03:15:28.924: [ CRSMAIN][1] Initializing EVMMgr
-06-25 03:15:29.099: [ COMMCRS][6683]clsc_connect: (11177bff0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
-06-25 03:15:30.493: [ CRSMAIN][1] Getting local nodename...
[   CLWAL][1]clsw_Initialize: OLR initlevel [70000]
-06-25 03:15:31.008: [ COMMCRS][6941]clsclisten: Error listening on: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.60.14)(PORT=0))  <<<<<< priv ip
-06-25 03:15:31.008: [ COMMCRS][6941]clsclisten: op 65 failed, NSerr (12560, 0), transport: (584, 0, 0)
-06-25 03:15:31.086: [ CRSMAIN][1] Created alert : (:CRSD00133:) :  Unable to get E2E port, error: IOException : clsclisten failed with ret 3
(File: caa_Socket.cpp, line: 525
-06-25 03:15:31.087: [    CRSD][1][PANIC] CRSD exiting: Unable to get E2E port after 2nd attempt
-06-25 03:15:31.087: [    CRSD][1] Done.
-06-25 03:15:31.565: [ default][1] First attempt: init CSS context succeeded.
[  clsdmt][515]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=anbob2DBG_CRSD))
-06-25 03:15:31.601: [  clsdmt][515]PID for the Process [2556174], connkey 1
-06-25 03:15:31.602: [  clsdmt][515]Creating PID [2556174] file for home /oracle/app/11.2.0.3/grid host anbob2 bin crs to /oracle/app/11.2.0.3/grid/crs/init/
-06-25 03:15:31.602: [  clsdmt][515]Writing PID [2556174] to the file [/oracle/app/11.2.0.3/grid/crs/init/anbob2.pid]
-06-25 03:15:32.313: [ default][515] Policy Engine is not initialized yet!
-06-25 03:15:32.313: [ default][1] CRS Daemon Starting
-06-25 03:15:32.319: [ default][515] Policy Engine is not initialized yet!
-06-25 03:15:32.323: [ default][1] ENV Logging level for Module: AGENT  1
-06-25 03:15:32.323: [ default][1] ENV Logging level for Module: AGFW  0
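
The key line in the crsd.log above is the clsclisten failure on the private IP. A minimal sketch for pulling the affected addresses out of a crsd.log, assuming the message format shown above; the function name is hypothetical:

```shell
# Hypothetical helper: reads crsd.log text on stdin and prints each HOST value
# from "clsclisten: Error listening on" lines, one per line, without duplicates.
failed_listen_hosts() {
    sed -n 's/.*clsclisten: Error listening on:.*(HOST=\([^)]*\)).*/\1/p' | sort -u
}

# Usage:
#   failed_listen_hosts < /oracle/app/11.2.0.3/grid/log/anbob2/crsd/crsd.log
```

Any address it prints is a candidate for being missing from TCP.INVITED_NODES.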

Cause
This is related to an invalid host name listed in the "TCP.INVITED_NODES" setting in the sqlnet.ora file.

Solution
The CRS and listener processes terminated because TCP.VALIDNODE_CHECKING had been enabled in sqlnet.ora. The crsd.log above shows clsclisten failing on the private IP 192.168.60.14 with transport error 584, the same TNS-00584 valid node checking error that the listener reported at startup.
Make sure TCP.INVITED_NODES includes not only the public IPs and VIPs of both nodes but also the private interconnect IPs (adding the SCAN IP, if one exists, is also recommended). Once the private IPs are in INVITED_NODES, also check the line "NAMES.DIRECTORY_PATH = (TNSNAMES,EZCONNECT)" in sqlnet.ora and make sure there are no spaces inside the parentheses.
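
Before enabling valid node checking, it may help to compare the addresses the cluster needs against what is actually listed. A rough sketch, assuming the whole TCP.INVITED_NODES list sits on one line of sqlnet.ora and contains literal addresses (wildcard entries are not expanded); both function names are hypothetical:

```shell
# Hypothetical helper: prints the entries of TCP.INVITED_NODES from a sqlnet.ora
# file, one per line. Assumes the whole list sits on a single line.
invited_nodes() {
    grep -i '^[[:space:]]*tcp\.invited_nodes' "$1" |
        sed 's/^[^(]*(//; s/).*$//' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Hypothetical helper: prints every address from the remaining arguments that
# is not literally listed in the file given as the first argument.
missing_nodes() {
    file=$1
    shift
    for ip in "$@"; do
        invited_nodes "$file" | grep -qxF -e "$ip" || echo "$ip"
    done
}

# Usage:
#   missing_nodes sqlnet.ora 133.96.60.14 133.96.60.114 192.168.60.14
```

Run against the sqlnet.ora shown earlier in this post, the usage line would report 192.168.60.14, the private IP from the crsd.log error.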

Author: ANBOB
A No Bad Oracle Blog
Original article: Troubleshooting the crsd.bin and listener process. Thanks to the original author for sharing.
