[root@fzppon05vs1n ~]# crsctl start crs
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
PRVG-11960 : Set user ID bit is not set for file "/u01/grid/12.2.0.3/bin/extjob" on node "fzppon05vs1n".
PRVG-2031 : Owner of file "/u01/grid/12.2.0.3/bin/extjob" did not match the expected value on node "fzppon05vs1n". [Expected = "root(0)" ; Found = "oracle(54321)"]
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
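Before touching anything, it is worth a read-only look at the file the error is flagging (plain ls and GNU stat, using the path from the message above):
# ls -l /u01/grid/12.2.0.3/bin/extjob
# stat -c '%U:%G %a' /u01/grid/12.2.0.3/bin/extjob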
Before you rush to change any file permissions, read the solutions below carefully, because most probably it’s not a permission issue!
I’ve faced this error on many occasions, and each time I fixed it with a different solution. Here is a list of all the solutions; any one of them may work for you.
Solution #2: kill all duplicate ohasd services
[root@fzppon05vs1n ~]# ps -ef | grep -v grep | grep '.bin'
root 19786 1 1 06:18 ? 00:00:39 /u01/grid/12.2.0.3/bin/ohasd.bin reboot
root 19788 1 0 06:18 ? 00:00:00 /u01/grid/12.2.0.3/bin/ohasd.bin reboot
root 19850 1 0 06:18 ? 00:00:13 /u01/grid/12.2.0.3/bin/orarootagent.bin
root 19958 1 0 06:18 ? 00:00:14 /u01/grid/12.2.0.3/bin/oraagent.bin
…
Lots of ohasd.bin processes were found running, while there is supposed to be only one ohasd.bin process.
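A quick way to count them (a one-liner sketch, assuming pgrep from procps is available):
# pgrep -c -f 'ohasd.bin'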
Checking all ohasd-related processes:
[root@fzppon05vs1n ~]# ps -ef | grep -v grep | grep ohasd
root 1900 1 0 06:17 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
root 1947 1900 0 06:17 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
root 19786 1 1 06:18 ? 00:00:00 /u01/grid/12.2.0.3/bin/ohasd.bin reboot
root 19788 1 0 06:18 ? 00:00:00 /u01/grid/12.2.0.3/bin/ohasd.bin reboot
Now, let’s kill all ohasd processes and give it a try:
[root@fzppon05vs1n ~]# kill -9 1900 1947 19786 19788
or simply:
[root@fzppon05vs1n ~]# ps -ef | grep 'init.ohasd' | grep -v grep | awk '{print $2}' | xargs -r kill -9
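The one-liner above only targets the init.ohasd wrappers; if you also want to clear the ohasd.bin daemons in one go, a hedged alternative (assuming pkill from procps, and only on a node where the stack is already broken):
# pkill -9 -f 'init.ohasd'
# pkill -9 -f 'ohasd.bin'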
Start the clusterware:
[root@fzppon05vs1n ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
Voilà! Started up.
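To double-check that the stack really is healthy after startup, the standard status commands are enough:
# crsctl check crs
# crsctl stat res -t -init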
Solution #4: Re-configuring the clusterware
Note: Re-configuring the clusterware should be done on the malfunctioning node only; it is not supposed to have any impact on the other working nodes of the cluster:
# $GRID_HOME/crs/install/rootcrs.sh -deconfig -force
# $GRID_HOME/root.sh
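Putting it together, a minimal sketch of the whole reconfigure pass, assuming GRID_HOME points at the grid home from the listings above and that everything is run as root on the failing node only:
# export GRID_HOME=/u01/grid/12.2.0.3
# $GRID_HOME/crs/install/rootcrs.sh -deconfig -force
# $GRID_HOME/root.sh
# $GRID_HOME/bin/crsctl check crs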
CRS-41053 may look vague; moreover, it may mention a different file than extjob in the error message.
– First, don’t rush to change the file’s ownership or permissions as the error message advises.
– Second, check for any redundant running clusterware background processes, kill them, and then try to start up the clusterware (see the sketch after this list).
– If the clusterware is still failing, restart the node and check again for any redundant processes; if you find any, kill them and start the cluster.
– Lastly, if your clusterware still doesn’t come up, use the silver bullet and reconfigure the clusterware on the malfunctioning node.
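To make the “kill redundant processes, then start” step repeatable, here is a minimal bash sketch; it assumes pgrep/pkill from procps, uses the grid home from this post, and is only meant for a node where the stack is already down:
#!/bin/bash
# Sketch: clear leftover ohasd processes on a broken node, then start the clusterware.
GRID_HOME=/u01/grid/12.2.0.3

if pgrep -f 'ohasd' >/dev/null; then
    echo "Killing leftover ohasd processes ..."
    pkill -9 -f 'init.ohasd'
    pkill -9 -f 'ohasd.bin'
    sleep 5
fi

"$GRID_HOME/bin/crsctl" start crs
"$GRID_HOME/bin/crsctl" check crs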