How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]

2023-10-27

  Modified 21-MAY-2010     Type HOWTO     Status PUBLISHED  

In this Document
  Goal
  Solution
     Start up sequence:
     Cluster status
     Case 1: OHASD.BIN does not start
     Case 2: OHASD Agents do not start
     Case 3: CSSD.BIN does not start
     Case 4: CRSD.BIN does not start
     Case 5: GPNPD.BIN does not start
     Case 6: Various other daemons do not start
     Case 7: CRSD Agents do not start
     Network and Naming Resolution Verification
     Log File Location, Ownership and Permission
     Network Socket File Location, Ownership and Permission
     Diagnostic file collection
  References


 

 

Applies to:

Oracle Server - Enterprise Edition - Version: 11.2.0.1 and later   [Release: 11.2 and later ]
Information in this document applies to any platform.

Goal

The goal of this note is to provide a reference for troubleshooting 11gR2 Grid Infrastructure clusterware startup issues. It applies to issues in both new environments (during root.sh or rootupgrade.sh) and unhealthy existing environments. For root.sh issues specifically, see Note 1053970.1.

Solution

Start up sequence:

In a nutshell, the operating system starts ohasd; ohasd starts agents that start the daemons (gipcd, mdnsd, gpnpd, ctssd, ocssd, crsd, evmd, asm, etc.); and crsd starts agents that start user resources (database, SCAN, listener, etc.).

For the detailed Grid Infrastructure clusterware startup sequence, refer to note 1053147.1.

Cluster status


To find out cluster and daemon status:

$GRID_HOME/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

$GRID_HOME/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                  Started
ora.crsd
      1        ONLINE  ONLINE       rac1
ora.cssd
      1        ONLINE  ONLINE       rac1
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1
ora.ctssd
      1        ONLINE  ONLINE       rac1                  OBSERVER
ora.diskmon
      1        ONLINE  ONLINE       rac1
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1
ora.evmd
      1        ONLINE  ONLINE       rac1
ora.gipcd
      1        ONLINE  ONLINE       rac1
ora.gpnpd
      1        ONLINE  ONLINE       rac1
ora.mdnsd
      1        ONLINE  ONLINE       rac1
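
The check above can be scripted. The following is a minimal sketch, not an Oracle-supplied tool: the function name offline_components is hypothetical, and it assumes every healthy component line ends in "is online" as in the sample output.

```shell
#!/bin/sh
# Print the "crsctl check crs" message lines that do not report "is online".
# Reads the command output on stdin; prints nothing when everything is online.
offline_components() {
    # Keep only CRS-nnnn message lines, then drop the healthy ones.
    grep '^CRS-[0-9]*:' | grep -v 'is online' || true
}

# Typical use on a cluster node (GRID_HOME must be set):
#   "$GRID_HOME/bin/crsctl" check crs | offline_components
```

Any line it prints points at the daemon to investigate in the cases below.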


Case 1: OHASD.BIN does not start


Because ohasd.bin is responsible for starting all other clusterware processes, directly or indirectly, it must start properly for the rest of the stack to come up.

Automatic ohasd.bin start up depends on the following:

1. OS is at appropriate run level:

The OS needs to be at the specified run level before CRS will try to start up.

To find out at which run level the clusterware needs to come up:

cat /etc/inittab|grep init.ohasd
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null


The above example shows that CRS is supposed to run at run levels 3 and 5. Note that, depending on the platform, CRS comes up at different run levels.

To find out current run level:

who -r
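
The two lookups can be combined into one comparison. This is a sketch under assumptions: the function names are hypothetical, the inittab entry has the colon-separated form shown above, and the "who -r" parsing in the usage comment matches Linux output only.

```shell
#!/bin/sh
# Print the run-level field of the init.ohasd entry from inittab-format input.
ohasd_runlevels() {
    awk -F: '/init\.ohasd/ && $1 !~ /^#/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Succeed if run level $1 appears in the run-level string $2 (for example "35").
runlevel_allowed() {
    [ -n "$1" ] || return 1
    case "$2" in
        *"$1"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage on Linux (field position in "who -r" output differs on other platforms):
#   levels=$(ohasd_runlevels < /etc/inittab)
#   current=$(who -r | awk '{print $2}')
#   runlevel_allowed "$current" "$levels" || echo "run level $current is not in $levels"
```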


2. "init.ohasd run" is up

On Linux/UNIX, "init.ohasd run" is configured in /etc/inittab, so process init (pid 1; /sbin/init on Linux, Solaris and HP-UX, /usr/sbin/init on AIX) starts "init.ohasd run" and respawns it if it fails. Without "init.ohasd run" up and running, ohasd.bin will not start:

ps -ef|grep init.ohasd|grep -v grep
root      2279     1  0 18:14 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run


3. Clusterware auto start is enabled (it is enabled by default)

By default, CRS is enabled for auto start upon node reboot. To enable it explicitly:

$GRID_HOME/bin/crsctl enable crs

To verify whether it is currently enabled:

cat $SCRBASE/$HOSTNAME/root/ohasdstr
enable

SCRBASE is /etc/oracle/scls_scr on Linux and AIX, and /var/opt/oracle/scls_scr on HP-UX and Solaris.

Note: NEVER EDIT THE FILE MANUALLY; use the "crsctl enable crs" or "crsctl disable crs" command instead.
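
A read-only way to script this check is sketched below. The function name autostart_state is hypothetical, and the usage line assumes that $(hostname) matches the directory name under SCRBASE, which may not hold if the host name is fully qualified.

```shell
#!/bin/sh
# Print the auto-start setting recorded in an ohasdstr file without modifying it.
# Prints "unreadable" when the file is missing or not readable by this user.
autostart_state() {
    if [ -r "$1" ]; then
        tr -d ' \t\r\n' < "$1"
        echo
    else
        echo "unreadable"
    fi
}

# Usage on Linux, as root:
#   autostart_state "/etc/oracle/scls_scr/$(hostname)/root/ohasdstr"
```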

4. The file system on which GRID_HOME resides is online when the init script S96ohasd is executed. Once S96ohasd has run, messages like the following should appear in the OS messages file:

Jan 20 20:46:51 rac1 logger: Oracle HA daemon is enabled for autostart.
..
Jan 20 20:46:57 rac1 logger: exec /ocw/grid/perl/bin/perl -I/ocw/grid/perl/lib /ocw/grid/bin/crswrapexece.pl /ocw/grid/crs/install/s_crsconfig_rac1_env.txt /ocw/grid/bin/ohasd.bin "reboot"


If you see the first line but not the last line, the file system containing GRID_HOME was likely not online when S96ohasd was executed.
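
The same comparison can be scripted against the OS messages file. This is a rough sketch: the function name and its three result words are hypothetical, and it searches the whole file rather than only the most recent boot, so older entries can mask a current problem.

```shell
#!/bin/sh
# Report whether the autostart line and the ohasd.bin exec line both appear
# in an OS messages file. Prints one of: no-autostart-message, exec-missing, ok.
check_ohasd_boot_messages() {
    msgfile=$1
    if ! grep -q 'Oracle HA daemon is enabled for autostart' "$msgfile"; then
        echo "no-autostart-message"
    elif ! grep -q 'exec .*ohasd\.bin' "$msgfile"; then
        # First line present, last line absent: suspect the GRID_HOME file system.
        echo "exec-missing"
    else
        echo "ok"
    fi
}

# Usage (the messages file location is platform dependent; this path is the Linux default):
#   check_ohasd_boot_messages /var/log/messages
```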

5. Oracle Local Registry (OLR, $GRID_HOME/cdata/${HOSTNAME}.olr) is accessible

ls -l $GRID_HOME/cdata/*.olr
-rw------- 1 root  oinstall 272756736 Feb  2 18:20 rac1.olr


If the OLR is inaccessible or corrupted, ohasd.log will likely contain messages similar to the following:

..
2010-01-24 22:59:10.470: [ default][1373676464] Initializing OLR
2010-01-24 22:59:10.472: [  OCROSD][1373676464]utopen:6m':failed in stat OCR file/disk /ocw/grid/cdata/rac1.olr, errno=2, os err string=No such file or directory
2010-01-24 22:59:10.472: [  OCROSD][1373676464]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory
2010-01-24 22:59:10.473: [  OCRRAW][1373676464]proprinit: Could not open raw device
2010-01-24 22:59:10.473: [  OCRAPI][1373676464]a_init:16!: Backend init unsuccessful : [26]
2010-01-24 22:59:10.473: [  CRSOCR][1373676464] OCR context init failure.  Error: PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
2010-01-24 22:59:10.473: [ default][1373676464] OLR initalization failured, rc=26
2010-01-24 22:59:10.474: [ default][1373676464]Created alert : (:OHAS00106:) :  Failed to initialize Oracle Local Registry
2010-01-24 22:59:10.474: [ default][1373676464][PANIC] OHASD exiting; Could not init OLR


OR

..
2010-01-24 23:01:46.275: [  OCROSD][1228334000]utread:3: Problem reading buffer 1907f000 buflen 4096 retval 0 phy_offset 102400 retry 5
2010-01-24 23:01:46.275: [  OCRRAW][1228334000]propriogid:1_1: Failed to read the whole bootblock. Assumes invalid format.
2010-01-24 23:01:46.275: [  OCRRAW][1228334000]proprioini: all disks are not OCR/OLR formatted
2010-01-24 23:01:46.275: [  OCRRAW][1228334000]proprinit: Could not open raw device
2010-01-24 23:01:46.275: [  OCRAPI][1228334000]a_init:16!: Backend init unsuccessful : [26]
2010-01-24 23:01:46.276: [  CRSOCR][1228334000] OCR context init failure.  Error: PROCL-26: Error while accessing the physical storage
2010-01-24 23:01:46.276: [ default][1228334000] OLR initalization failured, rc=26
2010-01-24 23:01:46.276: [ default][1228334000]Created alert : (:OHAS00106:) :  Failed to initialize Oracle Local Registry
2010-01-24 23:01:46.277: [ default][1228334000][PANIC] OHASD exiting; Could not init OLR
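
To tell the two OLR failure modes apart quickly, the log can be searched for the signatures shown in the excerpts above. A minimal sketch, assuming those message texts are stable across patch levels; the function name and result words are hypothetical.

```shell
#!/bin/sh
# Classify an ohasd.log by the OLR failure signatures shown above.
# Prints one of: missing, corrupt, other, none.
olr_failure_kind() {
    log=$1
    if grep -q 'failed in stat OCR file/disk' "$log"; then
        echo "missing"      # OLR file not found (errno=2 case)
    elif grep -q 'all disks are not OCR/OLR formatted' "$log"; then
        echo "corrupt"      # boot block unreadable or invalid format
    elif grep -q 'Could not init OLR' "$log"; then
        echo "other"        # OLR init failed for a reason not covered here
    else
        echo "none"
    fi
}

# Usage (log location as in the "Log File Location" section, node name from hostname):
#   olr_failure_kind "$GRID_HOME/log/$(hostname)/ohasd/ohasd.log"
```

On a node where ohasd.bin is down, running "ocrcheck -local" as root is another way to test OLR integrity.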

6. ohasd.bin is able to access the network socket files. Refer to the "Network Socket File Location, Ownership and Permission" section for example output.

Case 2: OHASD Agents do not start


OHASD.BIN spawns four agents/monitors to start its resources:

  oraagent : responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd etc
  orarootagent : responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs etc
  cssdagent / cssdmonitor : responsible for ora.cssd(for ocssd.bin) and ora.cssdmonitor(for cssdmonitor itself)

If ohasd.bin cannot start any of the above agents properly, the clusterware will not come to a healthy state. A common cause of agent failure is that the log file or log directory for the agent does not have proper ownership or permissions.

Refer to the "Log File Location, Ownership and Permission" section below for general reference.
 

Case 3: CSSD.BIN does not start


Successful cssd.bin startup depends on the following:

1. GPnP profile is accessible - gpnpd needs to be fully up to serve profile

If ocssd.bin is able to get the profile successfully, ocssd.log will likely contain messages similar to the following:

2010-02-02 18:00:16.251: [    GPnP][408926240]clsgpnpm_exchange: [at clsgpnpm.c:1175] Calling "ipc://GPNPD_rac1", try 4 of 500...
2010-02-02 18:00:16.263: [    GPnP][408926240]clsgpnp_profileVerifyForCall: [at clsgpnp.c:1867] Result: (87) CLSGPNP_SIG_VALPEER. Profile verified.  prf=0x165160d0
2010-02-02 18:00:16.263: [    GPnP][408926240]clsgpnp_profileGetSequenceRef: [at clsgpnp.c:841] Result: (0) CLSGPNP_OK. seq of p=0x165160d0 is '6'=6
2010-02-02 18:00:16.263: [    GPnP][408926240]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2186] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_rac1" disco ""

Otherwise, messages like the following will appear in ocssd.log:

2010-02-03 22:26:17.057: [    GPnP][3852126240]clsgpnpm_connect: [at clsgpnpm.c:1100] GIPC gipcretConnectionRefused (29) gipcConnect(ipc-ipc://GPNPD_rac1)
2010-02-03 22:26:17.057: [    GPnP][3852126240]clsgpnpm_connect: [at clsgpnpm.c:1101] Result: (48) CLSGPNP_COMM_ERR. Failed to connect to call url "ipc://GPNPD_rac1"
2010-02-03 22:26:17.057: [    GPnP][3852126240]clsgpnp_getProfileEx: [at clsgpnp.c:546] Result: (13) CLSGPNP_NO_DAEMON. Can't get GPnP service profile from local GPnP daemon
2010-02-03 22:26:17.057: [ default][3852126240]Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2010-02-03 22:26:17.057: [    CSSD][3852126240] clsgpnp_getProfile failed , rc(13)


2. Voting Disk is accessible

In 11gR2, ocssd.bin discovers voting disks using the discovery string from the GPnP profile. If not enough voting disks can be identified, ocssd.bin will abort itself:

2010-02-03 22:37:22.212: [    CSSD][2330355744]clssnmReadDiscoveryProfile: voting file discovery string(/share/storage/di*)
..
2010-02-03 22:37:22.227: [    CSSD][1145538880] clssnmvDiskVerify: Successful discovery of 0 disks
2010-02-03 22:37:22.227: [    CSSD][1145538880]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2010-02-03 22:37:22.227: [    CSSD][1145538880]clssnmvFindInitialConfigs: No voting files found
2010-02-03 22:37:22.228: [    CSSD][1145538880]###################################
2010-02-03 22:37:22.228: [    CSSD][1145538880]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread

If the voting disk is located on a non-ASM device, ownership and permissions should be:

-rw-r----- 1 ogrid oinstall 21004288 Feb  4 09:13 votedisk1
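
Ownership and mode can be compared against the expected values with a small helper. This is a sketch: check_perms is a hypothetical name, and the path in the usage line is only an example built from the discovery string and file name shown above.

```shell
#!/bin/sh
# Compare the mode, owner and group of a file against expected values.
# Usage: check_perms FILE MODE OWNER GROUP   (MODE as printed by ls -l)
check_perms() {
    file=$1
    want="$2 $3 $4"
    # ls -ld prints: mode links owner group size date name.
    # A trailing ".", "+" or "@" on the mode (SELinux/ACL markers) is ignored.
    have=$(ls -ld "$file" 2>/dev/null | awk '{ sub(/[.+@]$/, "", $1); print $1, $3, $4 }')
    if [ "$have" = "$want" ]; then
        echo "OK   $file ($have)"
    else
        echo "DIFF $file: expected '$want', found '${have:-nothing}'"
        return 1
    fi
}

# Usage (example path; list the configured voting files with "crsctl query css votedisk"):
#   check_perms /share/storage/votedisk1 -rw-r----- ogrid oinstall
```

The same helper applies to any other file whose expected ownership is listed in this note.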

3. Network is functional and name resolution is working:

If ocssd.bin cannot bind to any network, ocssd.log will likely contain messages like the following:

2010-02-03 23:26:25.804: [GIPCXCPT][1206540320]gipcmodGipcPassInitializeNetwork: failed to find any interfaces in clsinet, ret gipcretFail (1)
2010-02-03 23:26:25.804: [GIPCGMOD][1206540320]gipcmodGipcPassInitializeNetwork: EXCEPTION[ ret gipcretFail (1) ]  failed to determine host from clsinet, using default
..
2010-02-03 23:26:25.810: [    CSSD][1206540320]clsssclsnrsetup: gipcEndpoint failed, rc 39
2010-02-03 23:26:25.811: [    CSSD][1206540320]clssnmOpenGIPCEndp: failed to listen on gipc addr gipc://rac1:nm_eotcs- ret 39
2010-02-03 23:26:25.811: [    CSSD][1206540320]clssscmain: failed to open gipc endp


To validate the network, refer to note 1054902.1.


4. Vendor clusterware is up (if using vendor clusterware)

Grid Infrastructure provides full clusterware functionality and does not need vendor clusterware to be installed. However, if Grid Infrastructure runs on top of vendor clusterware in your environment, the vendor clusterware needs to come up fully before CRS can be started. To verify:

$GRID_HOME/bin/lsnodes -n

Before the clusterware is installed, execute the command below:

$INSTALL_SOURCE/install/lsnodes -v

Case 4: CRSD.BIN does not start


Successful crsd.bin startup depends on the following:

1. ocssd is fully up

If ocssd.bin is not fully up, crsd.log will show messages like the following:

2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clssscConnect: gipc request failed with 29 (0x16)
2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clsssInitNative: connect failed, rc 29
2010-02-03 22:37:51.639: [  CRSRTI][1548456880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..



2. OCR is accessible

If the OCR is located on ASM and ASM is unavailable, crsd.log will likely show messages like:

2010-02-03 22:22:55.186: [  OCRASM][2603807664]proprasmo: Error in open/create file in dg [GI]
[  OCRASM][2603807664]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup

2010-02-03 22:22:55.189: [  OCRASM][2603807664]proprasmo: kgfoCheckMount returned [7]
2010-02-03 22:22:55.189: [  OCRASM][2603807664]proprasmo: The ASM instance is down
2010-02-03 22:22:55.190: [  OCRRAW][2603807664]proprioo: Failed to open [+GI]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2010-02-03 22:22:55.190: [  OCRRAW][2603807664]proprioo: No OCR/OLR devices are usable
2010-02-03 22:22:55.190: [  OCRASM][2603807664]proprasmcl: asmhandle is NULL
2010-02-03 22:22:55.190: [  OCRRAW][2603807664]proprinit: Could not open raw device
2010-02-03 22:22:55.190: [  OCRASM][2603807664]proprasmcl: asmhandle is NULL
2010-02-03 22:22:55.190: [  OCRAPI][2603807664]a_init:16!: Backend init unsuccessful : [26]
2010-02-03 22:22:55.190: [  CRSOCR][2603807664] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2010-02-03 22:22:55.190: [    CRSD][2603807664][PANIC] CRSD exiting: Could not init OCR, code: 26

Note: in 11.2 ASM starts before crsd.bin, and brings up the diskgroup automatically if it contains the OCR.


If the OCR is located on a non-ASM device, expected ownership and permissions are:

-rw-r----- 1 root  oinstall  272756736 Feb  3 23:24 ocr


If the OCR is located on a non-ASM device and it is unavailable, crsd.log will likely show messages similar to the following:

2010-02-03 23:14:33.583: [  OCROSD][2346668976]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory
2010-02-03 23:14:33.583: [  OCRRAW][2346668976]proprinit: Could not open raw device
2010-02-03 23:14:33.583: [ default][2346668976]a_init:7!: Backend init unsuccessful : [26]
2010-02-03 23:14:34.587: [  OCROSD][2346668976]utopen:6m':failed in stat OCR file/disk /share/storage/ocr, errno=2, os err string=No such file or directory
2010-02-03 23:14:34.587: [  OCROSD][2346668976]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory
2010-02-03 23:14:34.587: [  OCRRAW][2346668976]proprinit: Could not open raw device
2010-02-03 23:14:34.587: [ default][2346668976]a_init:7!: Backend init unsuccessful : [26]
2010-02-03 23:14:35.589: [    CRSD][2346668976][PANIC] CRSD exiting: OCR device cannot be initialized, error: 1:26


If the OCR is corrupted, crsd.log will likely show messages like the following:

2010-02-03 23:19:38.417: [ default][3360863152]a_init:7!: Backend init unsuccessful : [26]
2010-02-03 23:19:39.429: [  OCRRAW][3360863152]propriogid:1_2: INVALID FORMAT
2010-02-03 23:19:39.429: [  OCRRAW][3360863152]proprioini: all disks are not OCR/OLR formatted
2010-02-03 23:19:39.429: [  OCRRAW][3360863152]proprinit: Could not open raw device
2010-02-03 23:19:39.429: [ default][3360863152]a_init:7!: Backend init unsuccessful : [26]
2010-02-03 23:19:40.432: [    CRSD][3360863152][PANIC] CRSD exiting: OCR device cannot be initialized, error: 1:26


If the owner or group of the grid user has been changed, crsd.log will likely show the following even when ASM is available:

2010-03-10 11:45:12.510: [  OCRASM][611467760]proprasmo: Error in open/create file in dg [SYSTEMDG]
[  OCRASM][611467760]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=1031, loc=kgfokge
ORA-01031: insufficient privileges

2010-03-10 11:45:12.528: [  OCRASM][611467760]proprasmo: kgfoCheckMount returned [7]
2010-03-10 11:45:12.529: [  OCRASM][611467760]proprasmo: The ASM instance is down
2010-03-10 11:45:12.529: [  OCRRAW][611467760]proprioo: Failed to open [+SYSTEMDG]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2010-03-10 11:45:12.529: [  OCRRAW][611467760]proprioo: No OCR/OLR devices are usable
2010-03-10 11:45:12.529: [  OCRASM][611467760]proprasmcl: asmhandle is NULL
2010-03-10 11:45:12.529: [  OCRRAW][611467760]proprinit: Could not open raw device
2010-03-10 11:45:12.529: [  OCRASM][611467760]proprasmcl: asmhandle is NULL
2010-03-10 11:45:12.529: [  OCRAPI][611467760]a_init:16!: Backend init unsuccessful : [26]
2010-03-10 11:45:12.530: [  CRSOCR][611467760] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=1031, loc=kgfokge
ORA-01031: insufficient privileges
] [7]


3. Network is functional and name resolution is working:

If the network is not fully functioning, ocssd.bin may still come up, but crsd.bin may fail and the crsd.log will show messages like:

2010-02-03 23:34:28.412: [    GPnP][2235814832]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=867, tl=3, f=0
2010-02-03 23:34:28.428: [  OCRAPI][2235814832]clsu_get_private_ip_addresses: no ip addresses found.
..
2010-02-03 23:34:28.434: [  OCRAPI][2235814832]a_init:13!: Clusterware init unsuccessful : [44]
2010-02-03 23:34:28.434: [  CRSOCR][2235814832] OCR context init failure.  Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2010-02-03 23:34:28.434: [    CRSD][2235814832][PANIC] CRSD exiting: Could not init OCR, code: 44

Or:

2009-12-10 06:28:31.974: [  OCRMAS][20]proath_connect_master:1: could not connect to master  clsc_ret1 = 9, clsc_ret2 = 9
2009-12-10 06:28:31.974: [  OCRMAS][20]th_master:11: Could not connect to the new master
2009-12-10 06:29:01.450: [ CRSMAIN][2] Policy Engine is not initialized yet!
2009-12-10 06:29:31.489: [ CRSMAIN][2] Policy Engine is not initialized yet!

Or:

2009-12-31 00:42:08.110: [ COMMCRS][10]clsc_receive: (102b03250) Error receiving, ns (12535, 12560), transport (505, 145, 0)

To validate the network, refer to note 1054902.1.
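
The crsd.log signatures in this case can be checked in one pass. The following is a sketch only: the function name and hint texts are hypothetical, and it assumes the message strings match the excerpts above.

```shell
#!/bin/sh
# Print one hint per known crsd.bin startup failure signature found in a crsd.log.
crsd_failure_hints() {
    log=$1
    grep -q 'CSS is not ready' "$log" &&
        echo "CSS not ready: check ocssd.bin first"
    grep -q 'ORA-15077' "$log" &&
        echo "OCR on ASM: no ASM instance is serving the diskgroup"
    grep -q 'ORA-01031' "$log" &&
        echo "OCR on ASM: insufficient privileges, check grid user owner/group"
    grep -q 'failed in stat OCR file/disk' "$log" &&
        echo "OCR on non-ASM device: file or disk not found"
    grep -q 'all disks are not OCR/OLR formatted' "$log" &&
        echo "OCR appears corrupted"
    grep -q 'PROC-44' "$log" &&
        echo "Network address/interface error: validate network and name resolution"
    return 0
}

# Usage:
#   crsd_failure_hints "$GRID_HOME/log/$(hostname)/crsd/crsd.log"
```

No output means none of the signatures listed in this note were found, not that crsd.bin is healthy.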
 

Case 5: GPNPD.BIN does not start

1. Name Resolution is not working

gpnpd.bin fails with the following error in gpnpd.log:

2010-05-13 12:48:11.540: [    GPnP][1171126592]clsgpnpm_exchange: [at clsgpnpm.c:1175] Calling "tcp://node2:9393", try 1 of 3...
2010-05-13 12:48:11.540: [    GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1015] ENTRY
2010-05-13 12:48:11.541: [    GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1066] GIPC gipcretFail (1) gipcConnect(tcp-tcp://node2:9393)
2010-05-13 12:48:11.541: [    GPnP][1171126592]clsgpnpm_connect: [at clsgpnpm.c:1067] Result: (48) CLSGPNP_COMM_ERR. Failed to connect to call url "tcp://node2:9393"

In the above example, make sure the current node is able to ping "node2" and that there is no firewall between them.
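
When several peers are involved, the unreachable hosts can be pulled out of gpnpd.log instead of being read off by hand. A sketch, assuming the "Failed to connect to call url" message format shown above; gpnp_failed_peers is a hypothetical name, and the ping options in the usage comment are the Linux ones.

```shell
#!/bin/sh
# Print the unique host names that gpnpd failed to reach, taken from
# 'Failed to connect to call url "tcp://host:port"' lines in a gpnpd.log.
gpnp_failed_peers() {
    sed -n 's/.*Failed to connect to call url "tcp:\/\/\([^:"]*\):[0-9]*".*/\1/p' "$1" | sort -u
}

# Usage on Linux:
#   for h in $(gpnp_failed_peers "$GRID_HOME/log/$(hostname)/gpnpd/gpnpd.log"); do
#       ping -c 2 "$h" >/dev/null 2>&1 || echo "cannot ping $h"
#   done
```

A successful ping does not rule out a firewall blocking the GPnP port, so a failed connection with a working ping still points to filtering between the nodes.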

Case 6: Various other daemons do not start

Two common causes:

1. Log file or directory for the daemon doesn't have appropriate ownership or permission

If the log file or log directory for the daemon doesn't have proper ownership or permissions, usually there is no new info in the log file and the timestamp remains the same while the daemon tries to come up.

Refer to the "Log File Location, Ownership and Permission" section below for general reference.

2. Network socket file doesn't have appropriate ownership or permission

In this case, the daemon log will show messages like:

2010-02-02 12:55:20.485: [ COMMCRS][1121433920] clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_GIPCD))

2010-02-02 12:55:20.485: [  clsdmt][1110944064]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_GIPCD))



Case 7: CRSD Agents do not start


CRSD.BIN spawns two agents to start user resources. The two agents share the same names and binaries as the ohasd.bin agents:

  orarootagent : responsible for ora.net<n>.network, ora.<nodename>.vip, ora.scan<n>.vip and ora.gns
  oraagent : responsible for ora.asm, ora.eons, ora.ons, listener, SCAN listener, diskgroup, database, service resource etc

To find out the user resource status:

$GRID_HOME/bin/crsctl stat res -t


If crsd.bin cannot start any of the above agents properly, user resources may not come up. A common cause of agent failure is that the log file or log directory for the agent does not have proper ownership or permissions.

Refer to the "Log File Location, Ownership and Permission" section below for general reference.

Network and Naming Resolution Verification


CRS depends on a fully functional network and name resolution. If the network or name resolution is not fully functioning, CRS may not come up successfully.

To validate network and name resolution setup, please refer to note 1054902.1

Log File Location, Ownership and Permission


Appropriate ownership and permissions of the sub-directories and files in $GRID_HOME/log are critical for CRS components to come up properly.

Assuming a Grid Infrastructure environment with node name rac1, CRS owner grid, and two separate RDBMS owners rdbmsap and rdbmsar, here is what it looks like under $GRID_HOME/log:

drwxrwxr-x 5 grid oinstall 4096 Dec  6 09:20 log
  drwxr-xr-x  2 grid oinstall 4096 Dec  6 08:36 crs
  drwxr-xr-t 17 root   oinstall 4096 Dec  6 09:22 rac1
    drwxr-x--- 2 grid oinstall  4096 Dec  6 09:20 admin
    drwxrwxr-t 4 root   oinstall  4096 Dec  6 09:20 agent
      drwxrwxrwt 7 root    oinstall 4096 Jan 26 18:15 crsd
        drwxr-xr-t 2 grid  oinstall 4096 Dec  6 09:40 application_grid
        drwxr-xr-t 2 grid  oinstall 4096 Jan 26 18:15 oraagent_grid
        drwxr-xr-t 2 rdbmsap oinstall 4096 Jan 26 18:15 oraagent_rdbmsap
        drwxr-xr-t 2 rdbmsar oinstall 4096 Jan 26 18:15 oraagent_rdbmsar
        drwxr-xr-t 2 grid  oinstall 4096 Jan 26 18:15 ora_oc4j_type_grid
        drwxr-xr-t 2 root    root     4096 Jan 26 20:09 orarootagent_root
      drwxrwxr-t 6 root oinstall 4096 Dec  6 09:24 ohasd
        drwxr-xr-t 2 grid oinstall 4096 Jan 26 18:14 oraagent_grid
        drwxr-xr-t 2 root   root     4096 Dec  6 09:24 oracssdagent_root
        drwxr-xr-t 2 root   root     4096 Dec  6 09:24 oracssdmonitor_root
        drwxr-xr-t 2 root   root     4096 Jan 26 18:14 orarootagent_root    
    -rw-rw-r-- 1 root root     12931 Jan 26 21:30 alertrac1.log
    drwxr-x--- 2 grid oinstall  4096 Jan 26 20:44 client
    drwxr-x--- 2 root oinstall  4096 Dec  6 09:24 crsd
    drwxr-x--- 2 grid oinstall  4096 Dec  6 09:24 cssd
    drwxr-x--- 2 root oinstall  4096 Dec  6 09:24 ctssd
    drwxr-x--- 2 grid oinstall  4096 Jan 26 18:14 diskmon
    drwxr-x--- 2 grid oinstall  4096 Dec  6 09:25 evmd     
    drwxr-x--- 2 grid oinstall  4096 Jan 26 21:20 gipcd     
    drwxr-x--- 2 root oinstall  4096 Dec  6 09:20 gnsd      
    drwxr-x--- 2 grid oinstall  4096 Jan 26 20:58 gpnpd    
    drwxr-x--- 2 grid oinstall  4096 Jan 26 21:19 mdnsd    
    drwxr-x--- 2 root oinstall  4096 Jan 26 21:20 ohasd     
    drwxrwxr-t 5 grid oinstall  4096 Dec  6 09:34 racg       
      drwxrwxrwt 2 grid oinstall 4096 Dec  6 09:20 racgeut
      drwxrwxrwt 2 grid oinstall 4096 Dec  6 09:20 racgevtf
      drwxrwxrwt 2 grid oinstall 4096 Dec  6 09:20 racgmain
    drwxr-x--- 2 grid oinstall  4096 Jan 26 20:57 srvm        

Note that most log files in a sub-directory inherit the ownership of the parent directory. The listing above is only a general reference for spotting unexpected recursive ownership and permission changes inside the CRS home. If you have a working node with the same version, use the working node as the reference.
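
One way to compare a failing node against a working node is to produce a sortable ownership listing on each and diff the two. A minimal sketch: perm_listing and the output file names are hypothetical, and names containing spaces or symbolic links are not handled correctly because only the last field of the ls output is kept.

```shell
#!/bin/sh
# Print "mode owner group path" for every entry under a directory, sorted by
# path, so that listings taken on two nodes can be compared with diff.
perm_listing() {
    ( cd "$1" && find . -exec ls -ld {} + | awk '{ print $1, $3, $4, $NF }' | sort -k4 )
}

# Usage, run on each node and then compare the two files on one of them:
#   perm_listing "$GRID_HOME/log" > /tmp/log_perms_$(hostname).txt
#   diff /tmp/log_perms_rac1.txt /tmp/log_perms_rac2.txt
```

Paths that contain the node name will always differ between nodes; the lines of interest are those where the mode, owner or group differs.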

Network Socket File Location, Ownership and Permission


Network socket files can be located in /tmp/.oracle, /var/tmp/.oracle or /usr/tmp/.oracle

Assuming a Grid Infrastructure environment with node name rac1, CRS owner grid, and clustername eotcs, below is an example output from the network socket directory:

drwxrwxrwt  2 root oinstall 4096 Feb  2 21:25 .oracle

./.oracle:
drwxrwxrwt 2 root  oinstall 4096 Feb  2 21:25 .
srwxrwx--- 1 grid oinstall    0 Feb  2 18:00 master_diskmon
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 mdnsd
-rw-r--r-- 1 grid oinstall    5 Feb  2 18:00 mdnsd.pid
prw-r--r-- 1 root  root        0 Feb  2 13:33 npohasd
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 ora_gipc_GPNPD_rac1
-rw-r--r-- 1 grid oinstall    0 Feb  2 13:34 ora_gipc_GPNPD_rac1_lock
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:39 s#11724.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:39 s#11724.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:39 s#11735.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:39 s#11735.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:45 s#12339.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 13:45 s#12339.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6275.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6275.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6276.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6276.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6278.1
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 s#6278.2
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sAevm
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sCevm
srwxrwxrwx 1 root  root        0 Feb  2 18:01 sCRSD_IPC_SOCKET_11
srwxrwxrwx 1 root  root        0 Feb  2 18:01 sCRSD_UI_SOCKET
srwxrwxrwx 1 root  root        0 Feb  2 21:25 srac1DBG_CRSD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 srac1DBG_CSSD
srwxrwxrwx 1 root  root        0 Feb  2 18:00 srac1DBG_CTSSD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 srac1DBG_EVMD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 srac1DBG_GIPCD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 srac1DBG_GPNPD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 srac1DBG_MDNSD
srwxrwxrwx 1 root  root        0 Feb  2 18:00 srac1DBG_OHASD
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 sLISTENER
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 sLISTENER_SCAN2
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:01 sLISTENER_SCAN3
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sOCSSD_LL_rac1_
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sOCSSD_LL_rac1_eotcs
-rw-r--r-- 1 grid oinstall    0 Feb  2 18:00 sOCSSD_LL_rac1_eotcs_lock
-rw-r--r-- 1 grid oinstall    0 Feb  2 18:00 sOCSSD_LL_rac1__lock
srwxrwxrwx 1 root  root        0 Feb  2 18:00 sOHASD_IPC_SOCKET_11
srwxrwxrwx 1 root  root        0 Feb  2 18:00 sOHASD_UI_SOCKET
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sOracle_CSS_LclLstnr_eotcs_1
-rw-r--r-- 1 grid oinstall    0 Feb  2 18:00 sOracle_CSS_LclLstnr_eotcs_1_lock
srwxrwxrwx 1 root  root        0 Feb  2 18:01 sora_crsqs
srwxrwxrwx 1 root  root        0 Feb  2 18:00 sprocr_local_conn_0_PROC
srwxrwxrwx 1 root  root        0 Feb  2 18:00 sprocr_local_conn_0_PROL
srwxrwxrwx 1 grid oinstall    0 Feb  2 18:00 sSYSTEM.evm.acceptor.auth
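
A quick way to spot socket files with an unexpected owner is sketched below. It assumes, as in the listing above, that entries belong either to root or to the CRS owner; in environments where other owners are legitimate, the output is only a starting point. The function name is hypothetical.

```shell
#!/bin/sh
# List entries in a socket directory whose owner is neither root nor the CRS owner.
# Usage: unexpected_socket_owners DIR CRS_OWNER
unexpected_socket_owners() {
    dir=$1
    crs_owner=$2
    # Skip the "total" line and the "." and ".." entries; print owner and name.
    ls -la "$dir" | awk -v o="$crs_owner" \
        'NR > 1 && $NF != "." && $NF != ".." && $3 != "root" && $3 != o { print $3, $NF }'
}

# Usage (try each of the possible locations listed above):
#   unexpected_socket_owners /var/tmp/.oracle grid
```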


Diagnostic file collection


If the issue cannot be identified with this note, run $GRID_HOME/bin/diagcollection.sh as root on all nodes, and upload all the .gz files it generates in the current directory.


References

NOTE:1053970.1 - Troubleshooting 11.2 Grid Infrastructure Installation Root.sh Issues
NOTE:1054902.1 - How to Validate Network and Name Resolution Setup for the Clusterware and RAC
NOTE:1068835.1 - What to Do if 11gR2 Clusterware is Unhealthy
NOTE:942166.1 - How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation
NOTE:969254.1 - How to Proceed from Failed Upgrade to 11gR2 Grid Infrastructure (CRS)

Products
  • Oracle Database Products > Oracle Database > Oracle Database > Oracle Server - Enterprise Edition
Keywords
OCR; ASM; CRS; INFRASTRUCTURE; GRID; CLUSTERWARE; CLUSTER~READY~SERVICES; VOTING~DISKS
Errors
ORA-1031; ORA-15077; CRS-4529; CRS-4533; CRS-4638; CRS-4537
