How to install Oracle 9iR2 RAC on SuSE Linux Enterprise Server 8
================================================================

--- Using the Network File System (NFS) ---
For example with Netapp (Network Appliance, www.netapp.com) NFS storage devices

Advantage: no expensive Fiber Channel equipment is needed, and the setup is much
easier to administer, both from a hardware and from a software point of view!
This was the easiest Oracle RAC installation we have ever done (compared with
using a cluster filesystem like OCFS on SCSI or Fiber Channel shared storage
systems).

We tested and wrote this documentation at Netapp HQ (Sunnyvale, California).
We used a Netapp F840 for NFS storage. The ORACLE_HOME was *not* put on NFS,
only the data files were shared, just like in a setup with Fiber Channel
storage and raw I/O or OCFS. It is possible to share ORACLE_HOME too, but we
did not test it and will not attempt to describe the necessary procedures in
this document. The two test nodes were two IBM xSeries 335 units, the network
was Gigabit Ethernet.

More info about NFS options: http://www.netapp.com/tech_library/3183.html

Requires at least these version numbers:

* United Linux 1.0 (SuSE Linux Enterprise Server 8 is "Powered by UL 1.0")

* UL kernel update for Oracle, at least these version numbers:
    k_smp-2.4.19-196.i586.rpm     - SMP kernel, for almost all Oracle users
    k_deflt-2.4.19-207.i586.rpm   - single CPU
    k_athlon-2.4.19-200.i586.rpm  - optimized for AMD Athlon
    k_debug-2.4.19-164.i586.rpm   - debug kernel
    k_psmp-2.4.19-201.i586.rpm    - support for *very* old Pentium CPUs
    kernel-source-2.4.19.SuSE-152.i586.rpm

* orarun.rpm: version 1.8 or greater

Tip: There is an easy way to work on all nodes simultaneously! Simply open a
KDE "Konsole" and in it open one terminal for each node. Now log into each node
on each of the terminals. After that, under "View" in the "Konsole" menu enable
"Send Input to All Sessions" for one of the terminals. Now, whatever you type
in this session is also sent to the other sessions, so that you work on all
nodes simultaneously! This greatly reduces the amount of typing you have to do!
If you do that, remember a few things: the node names, IPs etc. will be
different on each node; the shell history may be different on each node; "vi"
remembers where in a file you left off, so if you edit a file on all nodes
simultaneously, first check that the cursor is in the same position in the file
on all terminals. And so on - check what's going on on the other terminals
often (SHIFT + left/right arrow makes this a very quick and painless exercise)!

Tip: Use "sux" instead of "su" and the X-server permissions and the DISPLAY
variable are set automatically!

Tip: If you work in a noisy server room: get a Bose noise-canceling headset,
which is sold for frequent flyers. We found it very valuable in server rooms,
too!


ALL NODES
---------

- Install SLES-8 and the latest Service Pack, especially the orarun package
  (at least version 1.8) and the kernel updates (a quick version check is
  sketched further down in this list). Make sure to satisfy all dependencies
  of the "orarun" package!
  SP1: United Linux Service Pack #1.
  Alternatively, get at least the orarun package from:
  ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/sles-8/
  Kernel and other updates are available on the SuSE Maintenance web and can
  be installed conveniently via YOU (YaST Online Update).

- You may have to add "acpi=off" to the boot options if the system hangs
  during boot! Selecting "Safe Settings" as boot option includes that.
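- A quick way to verify that the required package versions are actually
  installed is to query the RPM database (a minimal sketch; the package names
  are the ones from the requirements list above - query only the kernel flavor
  you actually installed):

      rpm -q orarun                  # should report version 1.8 or greater
      rpm -q k_smp kernel-source     # or k_deflt / k_athlon / ..., as installed
      uname -r                       # the running kernel should be the updated one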
- set the password for user oracle: as root do "passwd oracle"

- optional: create /home/oracle:
      cp -a /etc/skel /home/oracle
      chown -R oracle:oinstall /home/oracle
      usermod -d /home/oracle oracle

- remove gcc 3.2 ("rpm -e gcc --nodeps") to be sure it is not used - we prefer
  an error message during installation over inadvertently using gcc 3.2.
  If you choose not to remove it, you have to edit $ORACLE_HOME/bin/genclntsh
  as well as $ORACLE_HOME/bin/genagtsh and add "/opt/gcc295/bin" *in front* of
  the PATH variable set in those scripts! Then do "relink all".

- in the file /etc/sysconfig/suseconfig set
      CHECK_ETC_HOSTS="no"
      BEAUTIFY_ETC_HOSTS="no"

- For using an NFS-mounted filesystem for the Oracle data files, add this line
  to /etc/fstab on all nodes. These particular values are the ones recommended
  and tested for Netapp NFS storage devices:

  nfs-storage.domain.com:/vol/oradata /var/opt/oracle nfs rw,fg,hard,nointr,rsize=32768,wsize=32768,tcp,noac,noatime,nfsvers=3,timeo=600 0 0

  See "man nfs" for a description of the NFS options. Please note that this is
  for Oracle DATA files only; for other files some of the options used are not
  very useful! For example, with "fg" we mount the NFS filesystem in the
  "foreground", so that the startup process of the machine will hang should the
  NFS server be down. This way an Oracle server will fully start only if the
  NFS storage is available: instead of producing an error, it will hang and
  wait until the network storage is available!
  Note that the "nolock" option is not used here: for RAC you must *not* use
  it, whereas for non-RAC (single-instance) Oracle databases on NFS you *must*
  use "nolock" (within its data files Oracle does its own locking already).
  Make sure the NFS server allows write access for "root", because the Oracle
  Cluster Manager runs as "root" and has to be able to access and write to a
  shared file. All nodes must have root access, of course.
  Then run these four commands (as root):
      chkconfig nfslock on
      chkconfig nfs on
      rcnfslock start
      rcnfs start

- set up the network interfaces (internal/external)

- set up /etc/hosts and /etc/hosts.equiv (for rsh)

- rcoracle start
  This sets the kernel parameters *before* we even start the installation!

- edit /etc/inetd.conf (or use the yast2 module for inetd) and remove the "#"
  in front of "shell..." and "login..."; one service is needed for "rsh", the
  other one for "rcp" (there are two lines for each, the one with the
  additional "-a" option does hostname verification via reverse lookup before
  accepting a connection)

- as root, do "chkconfig inetd on" and "rcinetd start" (for immediate start)

- Check that you can "rsh" and "rcp" - as user oracle - from any node to any
  other node in the cluster

- Optional: install and configure xntpd to synchronize the time/date on all
  nodes

- edit /etc/profile.d/oracle.sh and oracle.csh and set ORACLE_SID to some
  SID<node number> (the same base name plus the node number) on each node


NODE #1 (installation node)
---------------------------

- as oracle: ./runInstaller
  Install the "Cluster Manager".
  For the quorum file enter this: /var/opt/oracle/oracm
  (or wherever you mounted the shared NFS filesystem)
  Exit the installer when you're done. If you select "Next install" to install
  the patchset (next point) right from there, the installer will crash.
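- Before oracm is started for the first time (see the steps below) it is worth
  double-checking, on this node and on the others, that root can write to the
  shared NFS directory, because the Oracle Cluster Manager runs as root and
  will create the quorum file there. A minimal sketch, assuming /var/opt/oracle
  is the mount point from the fstab example above:

      # as root
      mount | grep /var/opt/oracle      # the NFS filesystem must show up here
      touch /var/opt/oracle/root_write_test && rm /var/opt/oracle/root_write_test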
- as oracle: ./runInstaller
  Change the source to where you saved the 9.2.0.2 (or later) patchset and
  install the 920x patch for the "Cluster Manager".
  Installing the patchset is HIGHLY recommended: beginning with 9.2.0.2 you no
  longer use "watchdogd" but a kernel module (written by Oracle and included in
  the SuSE kernels) called hangcheck-timer, which has many big advantages over
  the old "watchdogd"!

- edit /etc/sysconfig/oracle to enable the start of OCM and GSD (GSD will only
  work later, after the full software is installed):
      START_ORACLE_DB_OCM="yes"
      START_ORACLE_DB_GSD="yes"

- Start and stop oracm on ONE node - the very first time it starts it will
  create the shared quorum file. If oracm is started simultaneously on other
  nodes while this file does not exist yet, this creates conflicts and oracm
  starts only on some of the nodes - a question of random timing! Once this
  file exists there is no problem any more.
      rcoracle start
      rcoracle stop


ALL NODES
---------

- rcoracle start
  This starts OCM and hangcheck-timer (called "iofence-timer" by oracm).
  If you did not install our Oracle update kernel, you will get an error about
  a missing module "iofence-timer"!

- On each node: check the processes and $ORACLE_HOME/oracm/log/cm.log to see
  whether oracm is up. Check /var/log/messages and cm.log if there are
  problems. The end of cm.log should look like this (here: 4 nodes):

  ....
  HandleUpdate(): SYNC(2) from node(0) completed {Thu Feb 13 18:20:19 2003 }
  HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER {Thu Feb 13 18:20:19 2003 }
  HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER {Thu Feb 13 18:20:19 2003 }
  HandleUpdate(): NODE(2) IS ACTIVE MEMBER OF CLUSTER {Thu Feb 13 18:20:19 2003 }
  HandleUpdate(): NODE(3) IS ACTIVE MEMBER OF CLUSTER {Thu Feb 13 18:20:19 2003 }
  NMEVENT_RECONFIG [00][00][00][00][00][00][00][0f] {Thu Feb 13 18:20:20 2003 }
  Successful reconfiguration, 4 active node(s) node 0 is the master, my node num is 0 (reconfig 3)


NODE #1 (installation node)
---------------------------

- as oracle: export SRVM_SHARED_CONFIG=/var/opt/oracle/SharedConfig

- as oracle: ./runInstaller
  The installer will detect the running Oracle Cluster Manager and, through it,
  all nodes that are part of the cluster, and show them to you. Select ALL of
  the nodes to install the Oracle software on all of them!
  Select "software only", i.e. no database creation (we want to upgrade to the
  latest Oracle 9.2.0.x patchset first).
  Exit the installer.

- as oracle: in $ORACLE_BASE/oui/bin/linux/, do:
      ln -s libclntsh.so.9.0 libclntsh.so

- as oracle: ./runInstaller
  As the source select the 920x patchset directory (./stage/products.jar).
  Install the 920x patchset (we already patched the Cluster Manager earlier).

- copy the file /etc/oratab from the installation node to the same location on
  all other nodes, making sure the owner (oracle:oinstall) remains the same!
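  One way to do this (a minimal sketch; the node names rac2 and rac3 are
  assumptions - replace them with the host names of your other nodes, which
  must be reachable via rsh/rcp as set up earlier):

      # as root on the installation node
      for node in rac2 rac3; do
          rcp -p /etc/oratab $node:/etc/oratab
          rsh $node chown oracle:oinstall /etc/oratab
      done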
ALL NODES
---------

- rcoracle stop
- rcoracle start
  So that this time GSD is started too, which we only just installed.
  GSD is needed by OEM and by dbca.

- Go to $ORACLE_BASE and create a link to the shared NFS-mounted directory:
      cd $ORACLE_BASE
      ln -s /var/opt/oracle oradata
  (assuming you mounted the NFS directory under /var/opt/oracle)

- Go to $ORACLE_HOME and create a link to the shared NFS-mounted directory:
      cd $ORACLE_HOME
      rm -rf dbs
      ln -s /var/opt/oracle dbs
  (assuming you mounted the NFS directory under /var/opt/oracle)

- The installer "forgot" all log directories on the other nodes when copying
  the software from the installation node, so create them there:
      mkdir $ORACLE_HOME/rdbms/log
      mkdir $ORACLE_HOME/rdbms/audit
      mkdir $ORACLE_HOME/network/log
      mkdir $ORACLE_HOME/network/agent/log
      mkdir $ORACLE_HOME/ctx/log
      mkdir $ORACLE_HOME/hs/log
      mkdir $ORACLE_HOME/mgw/log
      mkdir $ORACLE_HOME/srvm/log
      mkdir $ORACLE_HOME/sqlplus/log
      mkdir $ORACLE_HOME/sysman/log


FINISHED
--------

Now that the software is installed and ready, and the Cluster Manager and the
GSD are up and running, we are ready to create a database!


NODE #1 (installation node)
---------------------------

- as root: touch /etc/rac_on
  (as of orarun-1.8-8 this is done by "rcoracle start" when oracm is started)
  This is a dirty Oracle hack; see the last 5 lines of $ORACLE_HOME/bin/dbca
  for what it does if you are interested. Alternatively, edit "dbca" and you do
  not need this "touch" command.

- to run gsdctl, gsd or srvconfig you have to do this in the same shell:
      unset JAVA_BINDIR JAVA_HOME
  (as of orarun-1.8-8 this is done by /etc/profile.d/oracle.sh - for user
  oracle only)

- as oracle: run "netca" to create an Oracle network configuration.
  The Network Configuration Assistant should detect the running cluster
  manager and offer a cluster configuration option! You must at least
  configure a listener. You can accept all the defaults, i.e. simply press
  NEXT until the listener configuration is done.

- run "lsnrctl start" on ALL NODES

- as oracle: dbca -datafileDestination $ORACLE_HOME/dbs
  Set up a database. Without the -datafileDestination parameter dbca assumes
  (and checks for!) raw devices, which we do not use here!
  If there is an error right at the start, try restarting the Cluster Manager
  and GSD via "rcoracle stop; rcoracle start".

- edit /etc/sysconfig/oracle to start additional services, e.g. the Oracle
  listener. If you set START_ORACLE_DB="yes" you have to edit /etc/oratab (on
  ALL NODES) and change the last letter in the line for your database (usually
  just one line, at the bottom) to "Y", or no database will be started.


URL:     http://www.suse.com/oracle/
Contact: oracle@suse.com