2017-11-25 ASM Fails to Start


Overview

Even with the correct UDEV settings, ASM and CRS fail to start up.

References

Investigation

Attempt to Start ASM

When I found ASM not to be running, I attempted to start ASM manually:

[oracle@redfern1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Sat Nov 18 17:42:22 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service

There were no entries in the ASM alert log.

CRS Log

With no entries in the ASM alert log, and the message above indicating a problem with CSS, I went to the CRS log first. The CRS log ( /opt/app/oracle/diag/crs/redfern1/crs/trace/alert.log ) shows:

2017-11-18 17:27:03.996000 +11:00
2017-11-18 17:27:03.991 [OHASD(2307)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 2307
2017-11-18 17:27:03.995 [OHASD(2307)]CRS-0714: Oracle Clusterware Release 12.1.0.2.0.
2017-11-18 17:27:04.032 [OHASD(2307)]CRS-2112: The OLR service started on node redfern1.
2017-11-18 17:27:04.313 [OHASD(2307)]CRS-1301: Oracle High Availability Service started on node redfern1.
2017-11-18 17:27:04.351 [OHASD(2307)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2017-11-18 17:27:05.559000 +11:00
2017-11-18 17:27:05.558 [CSSDMONITOR(2413)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 2413
2017-11-18 17:27:05.744 [CSSDAGENT(2411)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 2411
2017-11-18 17:27:06.360 [ORAAGENT(2405)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 2405
2017-11-18 17:27:06.929000 +11:00
2017-11-18 17:27:06.927 [ORAROOTAGENT(2409)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 2409
2017-11-18 17:27:08.413000 +11:00
2017-11-18 17:27:08.412 [ORAAGENT(2463)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 2463
2017-11-18 17:27:08.591 [MDNSD(2475)]CRS-8500: Oracle Clusterware MDNSD process is starting with operating system process ID 2475
2017-11-18 17:27:08.669 [EVMD(2478)]CRS-8500: Oracle Clusterware EVMD process is starting with operating system process ID 2478
2017-11-18 17:27:09.629000 +11:00
2017-11-18 17:27:09.628 [GPNPD(2514)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 2514
2017-11-18 17:27:10.807000 +11:00
2017-11-18 17:27:10.805 [GIPCD(2533)]CRS-8500: Oracle Clusterware GIPCD process is starting with operating system process ID 2533
2017-11-18 17:27:11.100 [GPNPD(2514)]CRS-2328: GPNPD started on node redfern1.
2017-11-18 17:27:17.670000 +11:00
2017-11-18 17:27:17.664 [CSSDMONITOR(2605)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 2605
2017-11-18 17:27:17.824 [CSSDAGENT(2622)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 2622
2017-11-18 17:27:17.830 [ORAROOTAGENT(2624)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 2624
2017-11-18 17:27:18.364 [OCSSD(2661)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 2661
2017-11-18 17:27:19.416000 +11:00
2017-11-18 17:27:19.415 [OCSSD(2661)]CRS-1713: CSSD daemon is started in hub mode
2017-11-18 17:27:19.512 [OCSSD(2661)]CRS-1714: Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /opt/app/oracle/diag/crs/redfern1/crs/trace/ocssd.trc
2017-11-18 17:27:34.525000 +11:00

OCSSD Log

Based on the entry from the CRS log highlighted above, the OCSSD log ( /opt/app/oracle/diag/crs/redfern1/crs/trace/ocssd.trc ) was examined and it shows:

2017-11-18 17:27:19.365144 :    CSSD:1471755904: (TLM) Starting CSS daemon, version 12.1.0.2.0 with uniqueness value 1510986439
2017-11-18 17:27:19.365197 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CSSD, LogLevel = 2, TraceLevel = 0
2017-11-18 17:27:19.365228 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDNMC, LogLevel = 2, TraceLevel = 0
2017-11-18 17:27:19.365252 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDGMPC, LogLevel = 2, TraceLevel = 0
2017-11-18 17:27:19.365274 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDGMCC, LogLevel = 2, TraceLevel = 0
2017-11-18 17:27:19.365294 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CSSDBCMC, LogLevel = 2, TraceLevel = 0
2017-11-18 17:27:19.365313 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CLSF, LogLevel = 0, TraceLevel = 0
2017-11-18 17:27:19.365332 :    CSSD:1471755904: clsu_load_ENV_levels: Module = SKGFD, LogLevel = 0, TraceLevel = 0
2017-11-18 17:27:19.365350 :    CSSD:1471755904: clsu_load_ENV_levels: Module = GPNP, LogLevel = 1, TraceLevel = 0
2017-11-18 17:27:19.365366 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CLSINET, LogLevel = 0, TraceLevel = 0
2017-11-18 17:27:19.365386 :    CSSD:1471755904: clsu_load_ENV_levels: Module = CLSFA, LogLevel = 3, TraceLevel = 0
    CSSD:1471755904: clsugetconf : Configuration type [4]. 
2017-11-18 17:27:19.370526 :    CSSD:1471755904: clssscGetParameterOLR: OLR fetch for parameter auth rep (9) failed with rc 21
2017-11-18 17:27:19.370530 :    CSSD:1471755904: clssscagStartAgLsnr: Failed to get auth location from OLR, constructing manually 
2017-11-18 17:27:19.370533 :    CSSD:1471755904: clssscagStartAgLsnr: auth location '/opt/app/grid_infra/12.1.0/grid/auth/css/' 
2017-11-18 17:27:19.370853 :    CSSD:1471755904: clssscGPNPInit: PERF_TIME Initializing GPNP
…
2017-11-18 17:27:19.491527 :    CSSD:1471755904: clssnmReadDiscoveryProfile: voting file discovery string()
2017-11-18 17:27:19.491529 :    CSSD:1471755904: clssnkInit: NK generic layer initializing.
2017-11-18 17:27:19.492218 :    CSSD:1312909056: clssscthrdmain: Starting thread clssnmvDDiscThread
2017-11-18 17:27:19.492401 :    GPNP:1312909056: clsgpnp_Init: [at clsgpnp0.c:654] '/opt/app/grid_infra/12.1.0/grid' in effect as GPnP home base.
2017-11-18 17:27:19.492407 :    GPNP:1312909056: clsgpnp_Init: [at clsgpnp0.c:708] GPnP pid=2661, cli=clsuGpnpg GPNP comp tracelevel=1, depcomp tracelevel=0, tlsrc:ORA_DAEMON_LOGGING_LEVELS, apitl:0, complog:1, tstenv:0, devenv:0, envopt:0, flags=2003
2017-11-18 17:27:19.494824 :    GPNP:1312909056: clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:402] Using FS Wallet Location : /opt/app/grid_infra/12.1.0/grid/gpnp/redfern1/wallets/peer/

2017-11-18 17:27:19.494838 :    GPNP:1312909056: clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:414] Wallet readable. Path: /opt/app/grid_infra/12.1.0/grid/gpnp/redfern1/wallets/peer/

2017-11-18 17:27:19.511333 :    GPNP:1312909056: clsgpnp_Term: [at clsgpnp0.c:1347] GPnP cli=clsuGpnpg
2017-11-18 17:27:19.512115 :    CLSF:1312909056: Allocated CLSF context
2017-11-18 17:27:19.512119 :    CSSD:1312909056: clssnmvDDiscThread: using discovery string  for initial discovery 
2017-11-18 17:27:19.512126 :   SKGFD:1312909056: Discovery with str::

2017-11-18 17:27:19.512130 :   SKGFD:1312909056: UFS discovery with ::

2017-11-18 17:27:19.512134 :   SKGFD:1312909056: Execute glob on the string /dev/sd*
2017-11-18 17:27:19.512186 :   SKGFD:1312909056: running stat on disk:/dev/sdf
2017-11-18 17:27:19.512233 :   SKGFD:1312909056: running stat on disk:/dev/sdd
2017-11-18 17:27:19.512237 :   SKGFD:1312909056: running stat on disk:/dev/sde
2017-11-18 17:27:19.512242 :   SKGFD:1312909056: running stat on disk:/dev/sdc
2017-11-18 17:27:19.512245 :   SKGFD:1312909056: running stat on disk:/dev/sdb
2017-11-18 17:27:19.512249 :   SKGFD:1312909056: running stat on disk:/dev/sda
2017-11-18 17:27:19.512255 :   SKGFD:1312909056: Fetching UFS disk :/dev/sda:

2017-11-18 17:27:19.512257 :    CLSF:1312909056: Ignoring 0-byte file /dev/sda

2017-11-18 17:27:19.512259 :   SKGFD:1312909056: Fetching UFS disk :/dev/sdb:

2017-11-18 17:27:19.512260 :    CLSF:1312909056: Ignoring 0-byte file /dev/sdb

2017-11-18 17:27:19.512262 :   SKGFD:1312909056: Fetching UFS disk :/dev/sdc:

2017-11-18 17:27:19.512263 :    CLSF:1312909056: Ignoring 0-byte file /dev/sdc

2017-11-18 17:27:19.512264 :   SKGFD:1312909056: Fetching UFS disk :/dev/sde:

2017-11-18 17:27:19.512266 :    CLSF:1312909056: Ignoring 0-byte file /dev/sde

2017-11-18 17:27:19.512267 :   SKGFD:1312909056: Fetching UFS disk :/dev/sdd:

2017-11-18 17:27:19.512268 :    CLSF:1312909056: Ignoring 0-byte file /dev/sdd

2017-11-18 17:27:19.512270 :   SKGFD:1312909056: Fetching UFS disk :/dev/sdf:

2017-11-18 17:27:19.512271 :    CLSF:1312909056: Ignoring 0-byte file /dev/sdf

2017-11-18 17:27:19.512272 :   SKGFD:1312909056: OSS discovery with ::

2017-11-18 17:27:19.512714 :    CSSD:1312909056: clssnmvDiskVerify: Successful discovery of 0 disks
2017-11-18 17:27:19.512717 :    CSSD:1312909056: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2017-11-18 17:27:19.512719 :    CSSD:1312909056: clssnmvFindInitialConfigs: No voting files found
2017-11-18 17:27:19.512757 :    CSSD:1312909056: (:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds

Points to note from this extract from the CSS log:

  1. There was no OLR file set up
  2. Devices prefixed by /dev/sd* are scanned by default for the voting disk
  3. The real voting disk is mounted on /dev/xvdh

Analysis

Reviewing my system build log, I found that the installation of GI failed as described in 06 Install GI 12.1.0.2 was not successful.

Deinstall GI

Tried following the procedure in How to Deconfigure/Reconfigure(Rebuild OCR) or Deinstall Grid Infrastructure (Doc ID 1377349.1) , except that I found that perl was not installed.