Monday, February 21, 2011

Building Mellanox OFED 1.5.2 for Rocks 5.4

Here are my notes from building Mellanox OFED 1.5.2 on Rocks 5.4.

Perform the build steps on a compute node. That way if the build process, run as root, has a bug, we don't risk having to rebuild the head node.

MLNX_OFED 1.5.2 ships with modules for kernel 2.6.18-194.el5. We are running 2.6.18-194.17.1.el5, so we need to build new kernel modules.

1. Download the ISO file MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5.iso from this page

2. Ensure that the build system is running the correct kernel

# uname -r

2.6.18-194.17.1.el5
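
The check above can be scripted so that a build never starts on the wrong kernel. This is only a minimal sketch: the function name check_kernel is hypothetical (it is not part of MLNX_OFED or Rocks), and the expected version is passed in by hand.

```shell
# check_kernel EXPECTED [RUNNING]
# Succeeds when the running kernel matches EXPECTED.
# RUNNING defaults to the output of `uname -r`; it is a parameter only so the
# check can be tried on a machine that is not the build node.
check_kernel() {
    expected="$1"
    running="${2:-$(uname -r)}"
    if [ "$running" = "$expected" ]; then
        echo "kernel OK: $running"
    else
        echo "kernel mismatch: running $running, expected $expected" >&2
        return 1
    fi
}

# Typical use at the top of a build script:
#   check_kernel 2.6.18-194.17.1.el5 || exit 1
```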

3. Mount the ISO and copy the contents to a scratch work area

# mount -t iso9660 -o loop /root/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5.iso /mnt/cdrom 
# mkdir /root/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5
# cp -r /mnt/cdrom/* /root/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5/
# umount /mnt/cdrom

4. Install some dependencies

# yum -y install libtool tcl-devel libstdc++-devel mkisofs gcc-c++ rpm-build

5. Uninstall some RPM files that will fail to uninstall during the ISO build

# yum remove \*openmpi\*
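
Because the glob removes every package with 'openmpi' in its name, it may help to record what is installed beforehand so the same set can be reinstalled later. A minimal sketch, assuming the package names arrive on standard input (for example from rpm -qa); list_openmpi is a hypothetical helper name.

```shell
# list_openmpi: print, sorted, every package name on stdin that contains
# "openmpi" (case-insensitive).
list_openmpi() {
    grep -i 'openmpi' | sort
}

# Typical use on the build node before the yum remove:
#   rpm -qa | list_openmpi > /root/openmpi-packages-before-ofed.txt
```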

6. Build the new ISO file

# cd /root/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5

# ./docs/mlnx_add_kernel_support.sh -i /root/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5.iso
Note: This program will create MLNX_OFED_LINUX ISO for rhel5.5 under /tmp directory.
      All Mellanox, OEM, OFED, or Distribution IB packages will be removed.
Do you want to continue?[y/N]:y
Building OFED RPMs...
Removing OFED RPMs...
Running mkisofs...
Created /tmp/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5.iso

# mkdir /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5
# mv /tmp/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5.iso /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5.iso

7. Copy the new files from the iso to the NFS share

# mount -t iso9660 -o loop /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5.iso /mnt/cdrom
# rsync -a /mnt/cdrom/ /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5/

# umount /mnt/cdrom

8. List the new kernel modules

# cd /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5
# find . -name 'kernel-*' | grep 194.17
./x86_64/kernel-ib-1.5.2-2.6.18_194.17.1.el5.x86_64.rpm
./x86_64/kernel-mft-2.6.2-2.6.18_194.17.1.el5.x86_64.rpm
./x86_64/kernel-ib-devel-1.5.2-2.6.18_194.17.1.el5.x86_64.rpm
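
Note that the RPM names spell the kernel version with an underscore (2.6.18_194.17.1.el5) where uname -r prints a hyphen, because a hyphen separates name, version and release in RPM file names. A small sketch that derives that tag from the running kernel instead of typing it by hand; kver_to_rpm_tag is a hypothetical helper name.

```shell
# kver_to_rpm_tag VERSION
# Convert a kernel version as printed by `uname -r` into the form used in the
# kernel-ib RPM file names listed above (hyphens become underscores).
kver_to_rpm_tag() {
    printf '%s\n' "$1" | tr '-' '_'
}

# Example: list only the RPMs built for the running kernel.
#   find . -name "kernel-*$(kver_to_rpm_tag "$(uname -r)")*.rpm"
```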

9. Test the installer on one of the compute nodes

# cd /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5
# ./mlnxofedinstall --force --hpc

This will automatically update the firmware on the HCA.
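
Once the installer has finished and the openibd service has been started, the port state can be checked with ibstat from the infiniband-diags package (that package appears in the OFED 1.4 installer log later in these notes). A minimal sketch, assuming ibstat prints a "State: Active" line for each port that is up; count_active_ports is a hypothetical helper that reads the ibstat output on standard input.

```shell
# count_active_ports: count the ports reported as "State: Active" in ibstat
# output read from stdin. "Physical state:" lines are not counted because the
# pattern is anchored to the start of the (indented) line.
count_active_ports() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that from
    # being treated as a failure by the caller.
    grep -c '^[[:space:]]*State:[[:space:]]*Active' || true
}

# Typical use on a compute node:
#   ibstat | count_active_ports
```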

10. This OFED can be installed on the compute nodes by adding the following section to extend-compute.xml (I normally put other driver updates into this 'post-98-installdrivers' script as well). Note the yum install near the end: the MLNX OFED installer removes any package with 'openmpi' in its name, and this line reinstalls those packages.


<file name="/etc/rc.d/rocksconfig.d/post-98-installdrivers" perms="0755">
#!/bin/sh

# Install Mellanox
if [ "$(/sbin/lspci | grep -i connectx)" != "" ] ; then
  /usr/bin/yum -y remove openmpi\* rocks-openmpi\*
  /share/apps/mellanox/MLNX_OFED_LINUX-1.5.2-2.0.0-rhel5.5-2.6.18-194.17.1.el5/mlnxofedinstall --hpc --force

  /sbin/chkconfig --add openibd
  /sbin/chkconfig openibd on
  /sbin/service openibd start
fi

/usr/bin/yum -y install my-custom-openmpi my-custom-application-openmpi

/bin/mv /etc/rc.d/rocksconfig.d/post-98-installdrivers /root/post-98-installdrivers

# Reboot one final time
/sbin/shutdown -r now

</file>
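
The lspci test in the script above decides whether a node gets the driver at all. Here is a minimal sketch of the same check as a reusable function, using grep -q rather than a comparison against an empty string. The name has_connectx is hypothetical, and the function reads lspci output on standard input so it can be tried without the hardware.

```shell
# has_connectx: succeed if the lspci output on stdin mentions a ConnectX
# adapter (case-insensitive), fail otherwise.
has_connectx() {
    grep -qi 'connectx'
}

# Typical use on a node:
#   if /sbin/lspci | has_connectx; then echo "ConnectX HCA found"; fi
```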

Adding IP over InfiniBand to Rocks

20120611 - Based on a question to the Rocks mailing list, I'm adding this section to explain how to enable TCP/IP over InfiniBand via Rocks. This process should add the IP addresses to the Rocks-managed DNS / hosts. The IP addresses of my compute-0-x nodes start at 254 and work backwards, so that's what I used for the IB IP addresses.

First add the new network, calling it 'infiniband', or whatever name you'd like
# rocks add network infiniband subnet=192.168.3.0 netmask=255.255.255.0
# ip=254 && for node in {1..16}; do
   rocks add host interface compute-0-${node} ib0 \
     ip=192.168.3.${ip} subnet=infiniband ;
   let ip=${ip}-1 ;
done
Repeat for the next set of nodes
# ip=238 && for node in {1..16}; do
   rocks add host interface compute-1-${node} ib0 \
     ip=192.168.3.${ip} subnet=infiniband ;
   let ip=${ip}-1 ;
done
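
Each rack of 16 nodes consumes 16 addresses, so the starting octet for the next rack is the previous start minus 16 (254, 238, 222, ...). Below is a dry-run sketch that prints the commands for review before anything touches the Rocks database. print_ib_plan is a hypothetical helper, and it assumes the network is named 'infiniband' as created by the rocks add network command above.

```shell
# print_ib_plan RACK START COUNT
# Print, without running them, the "rocks add host interface" commands for
# nodes compute-RACK-1 .. compute-RACK-COUNT, counting the last octet of
# 192.168.3.x down from START.
print_ib_plan() {
    rack="$1"
    ip="$2"
    count="$3"
    node=1
    while [ "$node" -le "$count" ]; do
        echo "rocks add host interface compute-${rack}-${node} ib0 ip=192.168.3.${ip} subnet=infiniband"
        node=$((node + 1))
        ip=$((ip - 1))
    done
}

# print_ib_plan 0 254 16   # rack 0: 192.168.3.254 down to 192.168.3.239
# print_ib_plan 1 238 16   # rack 1: 192.168.3.238 down to 192.168.3.223
```

Once the printed list looks right, the same output can be piped to sh to apply it.
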
And so on for any remaining racks.

Change the sshd_config on the compute nodes to not use DNS. I have found that ssh to compute nodes takes close to a minute when this is set to true
# rocks set attr ssh_use_dns false
Synchronize the configuration

# rocks sync config
Now open the firewall on ib0 for all ports and protocols
# rocks open appliance firewall compute \
   network=infiniband service="all" protocol="all"

# rocks sync host firewall compute

# rocks list host firewall compute-0-1
SERVICE PROTOCOL CHAIN ACTION NETWORK   OUTPUT-NETWORK FLAGS                                COMMENT SOURCE
ssh     tcp      INPUT ACCEPT public     -------------- -m state --state NEW                 ------- G     
all     all      INPUT ACCEPT public     -------------- -m state --state RELATED,ESTABLISHED ------- G     
all     all      INPUT ACCEPT infiniband -------------- ------------------------------------ ------- A     
all     all      INPUT ACCEPT private    -------------- ------------------------------------ ------- G     
Hope this helps

Building Mellanox OFED 1.4 for Rocks 5.3

Here are my notes from building Mellanox OFED 1.4 on a Rocks 5.3 x86_64 cluster running CentOS 5.4 and kernel 2.6.18-128.7.1:

1. Download the ISO file MLNX_OFED_LINUX-1.4-rhel5.3.iso from this page

MLNX_OFED 1.4 ships with modules for kernel 2.6.18-128. We are running 2.6.18-128.7.1, so we need to build new modules.

2. Mount the ISO and copy the contents to a scratch work area

# mount -t iso9660 -o loop /root/MLNX_OFED_LINUX-1.4-rhel5.3.iso /mnt/cdrom 
# mkdir /root/MLNX_OFED_LINUX-1.4
# cp -r /mnt/cdrom/* /root/MLNX_OFED_LINUX-1.4/
# umount /mnt/cdrom

3. Edit the script that builds the new ISO file so that it will work with CentOS (our centos-release says 5.4, but we are still running a 5.3 kernel)

# cd /root/MLNX_OFED_LINUX-1.4

Index: docs/mlnx_add_kernel_support.sh
===================================================================
--- docs/mlnx_add_kernel_support.sh.orig 2009-12-17 15:51:46.000000000 -0600
+++ docs/mlnx_add_kernel_support.sh 2009-12-17 15:52:00.000000000 -0600
@@ -279,7 +279,7 @@
         redhat-release-5Server-5.2.0.4)
         distro="rhel5.2"
         ;;
-        redhat-release-5Server-5.3.0.3)
+        redhat-release-5Server-5.3.0.3 | centos-release-5-4.el5.centos.1 )
         distro="rhel5.3"
         ;;
         sles-release-10-15.2)
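
The patch works because the script picks the distro label by matching a release-package string in a case statement, and | lets one branch accept several patterns. Below is a stripped-down sketch of that idea, not the actual script: map_distro and its argument are hypothetical, and only the two RHEL branches visible in the diff are reproduced.

```shell
# map_distro RELEASE_PKG
# Print the distro label for a release package string, mirroring the two
# case branches shown in the diff above (with the CentOS pattern added).
map_distro() {
    case "$1" in
        redhat-release-5Server-5.2.0.4)
            echo "rhel5.2"
            ;;
        redhat-release-5Server-5.3.0.3 | centos-release-5-4.el5.centos.1)
            echo "rhel5.3"
            ;;
        *)
            echo "unsupported release: $1" >&2
            return 1
            ;;
    esac
}

# On the CentOS 5.4 system described here, this should print rhel5.3,
# assuming `rpm -q centos-release` reports centos-release-5-4.el5.centos.1:
#   map_distro "$(rpm -q centos-release)"
```
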
4. Install some dependencies

# yum -y install libtool tcl-devel libstdc++-devel mkisofs gcc-c++

5. Uninstall some RPM files that will fail to uninstall during the ISO build

# /bin/rpm --nodeps -e --allmatches openmpi-libs-1.3.2-2.el5 \
 openmpi-devel-1.3.2-2.el5 rocks-openmpi-1.3.2-1 openmpi-1.3.2-2.el5 \
 openmpi-gnu-1.3.3-1.el5.uabeng

6. Build the new ISO file

# cd /root/MLNX_OFED_LINUX-1.4
# ./docs/mlnx_add_kernel_support.sh -i /root/MLNX_OFED_LINUX-1.4-rhel5.3.iso

Note: This program will create MLNX_OFED_LINUX ISO for rhel5.3 under /tmp directory.
      All Mellanox, OEM, OFED, or Distribution IB packages will be removed.
Do you want to continue?[y/N]:y
Building OFED RPMs...
Removing OFED RPMs...
Running mkisofs...
Created /tmp/MLNX_OFED_LINUX-1.4-rhel5.3.iso

# mv /tmp/MLNX_OFED_LINUX-1.4-rhel5.3.iso /share/apps/mellanox/MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1.iso

7. Copy the new files from the iso to the NFS share

# mount -t iso9660 -o loop /share/apps/mellanox/MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1.iso /mnt/cdrom
# cp -r /mnt/cdrom /share/apps/mellanox/MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1

# cd /share/apps/mellanox
# find ./MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1 -name 'kernel-*' | grep x86
./MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1/x86_64/kernel-ib-1.4-2.6.18_128.7.1.el5.x86_64.rpm
./MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1/x86_64/kernel-ib-1.4-2.6.18_128.el5.x86_64.rpm
./MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1/x86_64/kernel-ib-devel-1.4-2.6.18_128.el5.x86_64.rpm
./MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1/x86_64/kernel-ib-devel-1.4-2.6.18_128.7.1.el5.x86_64.rpm

8. Test the installer on the compute node

# cd /share/apps/mellanox/MLNX_OFED_LINUX-1.4-rhel5.3-kernel-2.6.18_128.7.1
# ./mlnxofedinstall --hpc

This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, or Distribution IB packages will be removed. 
Do you want to continue?[y/N]:y

Uninstalling the previous version of OFED 

Starting MLNX_OFED_LINUX-1.4 installation ... 

Installing mpi-selector RPM 
Preparing...                ########################################### [100%]
   1:mpi-selector           ########################################### [100%]
Installing kernel-ib RPM 
Preparing...                ########################################### [100%]
   1:kernel-ib              ########################################### [100%]
Installing ib-bonding RPM 
Preparing...                ########################################### [100%]
   1:ib-bonding             ########################################### [100%]
Installing mft RPM 
Preparing...                ########################################### [100%]
   1:mft                    ########################################### [100%]
Install user level RPMs: 
Preparing...                ########################################### [100%]
   1:libibverbs             ########################################### [  2%]
   2:libibcommon            ########################################### [  4%]
   3:libibumad              ########################################### [  6%]
   4:opensm-libs            ########################################### [  8%]
   5:librdmacm              ########################################### [ 10%]
   6:openmpi_intel          ########################################### [ 12%]
   7:libibmad               ########################################### [ 14%]
   8:infiniband-diags       ########################################### [ 16%]
   9:openmpi_gcc            ########################################### [ 18%]
  10:mpitests_openmpi_gcc   ########################################### [ 20%]
  11:mpitests_openmpi_pgi   ########################################### [ 22%]
  12:mpitests_openmpi_intel ########################################### [ 24%]
  13:qperf                  ########################################### [ 26%]
  14:perftest               ########################################### [ 28%]
  15:ibutils                ########################################### [ 30%]
  16:libmthca               ########################################### [ 32%]
  17:libmlx4                ########################################### [ 34%]
  18:openmpi_pgi            ########################################### [ 36%]
  19:mstflint               ########################################### [ 38%]
  20:mlnxofed-docs          ########################################### [ 40%]
  21:ofed-scripts           ########################################### [ 42%]
  22:libibverbs             ########################################### [ 44%]
  23:libibcommon            ########################################### [ 46%]
  24:libibumad              ########################################### [ 48%]
  25:mvapich_intel          ########################################### [ 50%]
  26:opensm-libs            ########################################### [ 52%]
  27:librdmacm              ########################################### [ 54%]
  28:libibcommon-devel      ########################################### [ 56%]
  29:libibumad-devel        ########################################### [ 58%]
  30:libibverbs-devel       ########################################### [ 60%]
  31:librdmacm-utils        ########################################### [ 62%]
  32:opensm                 ########################################### [ 64%]
  33:mvapich_gcc            ########################################### [ 66%]
  34:mpitests_mvapich_gcc   ########################################### [ 68%]
  35:mpitests_mvapich_pgi   ########################################### [ 70%]
  36:mpitests_mvapich_intel ########################################### [ 72%]
  37:mvapich_pgi            ########################################### [ 74%]
  38:libibverbs-utils       ########################################### [ 76%]
  39:librdmacm-devel        ########################################### [ 78%]
  40:librdmacm-devel        ########################################### [ 80%]
  41:opensm-devel           ########################################### [ 82%]
  42:opensm-devel           ########################################### [ 84%]
  43:libibumad-devel        ########################################### [ 86%]
  44:libibcommon-devel      ########################################### [ 88%]
  45:libibverbs-devel       ########################################### [ 90%]
  46:libibmad               ########################################### [ 92%]
  47:libmthca               ########################################### [ 94%]
  48:libmlx4                ########################################### [ 96%]
  49:libibmad-devel         ########################################### [ 98%]
  50:libibmad-devel         ########################################### [100%]
Device (15b3:673c):
        0c:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
        Link Width: 8x
        Link Speed: 2.5Gb/s


Installation finished successfully. 

The firmware version 2.6.0 is up to date. 
Note: To force firmware update use '--force-fw-update' flag.
Configuring /etc/security/limits.conf. 
warning: /etc/infiniband/openib.conf saved as /etc/infiniband/openib.conf.rpmsave