Wednesday, March 23, 2011

Install Lustre Monitoring Tool (LMT) on CentOS 5.5

In this article, I document the steps to build and install LMT and its dependency, Cerebro. Cerebro's configuration can get fairly complex, so this example keeps it simple in order to focus on LMT.

I don't cover MySQL configuration yet but plan to do so in the near future.

Cerebro and LMT Build Instructions

Both the build host and the install targets run CentOS 5.5 x86_64.
  1. Download and build cerebro (http://sourceforge.net/projects/cerebro/files/cerebro/) on your favorite build machine (make sure to set up your ~/rpmbuild directory structure and your ~/.rpmmacros file).
    • Download the latest source code (1.12 at this time) to ~/rpmbuild/SOURCES/
    • Download the src.rpm (I found it under the version 1.10 tree) and extract it to get the spec file
      
      $ mkdir -p ~/sources/cerebro
      $ cd ~/sources/cerebro
      $ rpm2cpio cerebro-1.10-1.src.rpm | cpio -idvm
      $ mv cerebro.spec ~/rpmbuild/SPECS/
      
      
    • Modify the cerebro.spec file as follows for version 1.12 (unified diff format)
      
      --- cerebro.spec 2010-04-07 16:17:35.000000000 -0500
      +++ cerebro.spec.new 2011-03-23 14:25:02.654373643 -0500
      @@ -1,12 +1,12 @@
       Name:    cerebro 
      -Version: 1.10
      +Version: 1.12
       Release: 1
       
       Summary: Cerebro cluster monitoring tools and libraries
       Group: System Environment/Base
       License: GPL
      -Source: cerebro-1.10.tar.gz
      -BuildRoot: %{_tmppath}/cerebro-1.10
      +Source: cerebro-1.12.tar.gz
      +BuildRoot: %{_tmppath}/cerebro-1.12
       
       %description
       Cerebro is a collection of cluster monitoring tools and libraries.
      @@ -90,7 +90,7 @@
       Event module to monitor node up/down.
       
       %prep
      -%setup  -q -n cerebro-1.10
      +%setup  -q -n cerebro-1.12
       
       %build
       %configure --program-prefix=%{?_program_prefix:%{_program_prefix}} \
      @@ -157,6 +157,7 @@
       %defattr(-,root,root)
       %doc README NEWS ChangeLog DISCLAIMER DISCLAIMER.UC COPYING
       %config(noreplace) %{_sysconfdir}/init.d/cerebrod
      +%config(noreplace) %{_sysconfdir}/cerebro.conf
       %{_includedir}/*
       %dir %{_libdir}/cerebro
       %{_libdir}/libcerebro*
      
    • Before building the RPM, I had to comment out the %_vendor macro in my ~/.rpmmacros file; otherwise configure kept adding the vendor to the --target switch
    • Build the RPM. This produces several packages; for the Lustre Monitoring Tool, all we need is the cerebro package
      
      $ rpmbuild -ba --sign ~/rpmbuild/SPECS/cerebro.spec
      
      
    • Look at the package info
      
      $ rpm -qpi ~/rpmbuild/RPMS/x86_64/cerebro-1.12-1.x86_64.rpm 
      Name        : cerebro                      Relocations: (not relocatable)
      Version     : 1.12                              Vendor: (none)
      Release     : 1                             Build Date: Wed 23 Mar 2011 02:12:09 PM CDT
      Install Date: (not installed)               Build Host: buildhost01
      Group       : System Environment/Base       Source RPM: cerebro-1.12-1.src.rpm
      Size        : 1039859                          License: GPL
      Signature   : DSA/SHA1, Wed 23 Mar 2011 02:12:09 PM CDT, Key ID xxxx
      Summary     : Cerebro cluster monitoring tools and libraries
      Description :
      Cerebro is a collection of cluster monitoring tools and libraries.
      
      
    • Take a look at the contents of the rpm
      
      $ rpm -qpl ~/rpmbuild/RPMS/x86_64/cerebro-1.12-1.x86_64.rpm 
      /etc/cerebro.conf
      /etc/init.d/cerebrod
      /usr/include/cerebro
      /usr/include/cerebro.h
      ...
      
  2. LMT RPM build
    • Temporarily install cerebro to satisfy the build requirement
      
      $ sudo rpm -Uvh ~/rpmbuild/RPMS/x86_64/cerebro-1.12-1.x86_64.rpm
      
      
    • Install the lua-devel package from EPEL
      
      $ sudo yum install lua-devel
      
      =============================================================================================
       Package                Arch                Version                  Repository         Size
      =============================================================================================
      Installing:
       lua-devel              i386                5.1.4-4.el5              epel               18 k
       lua-devel              x86_64              5.1.4-4.el5              epel               18 k
      Installing for dependencies:
       lua                    i386                5.1.4-4.el5              epel              228 k
       lua                    x86_64              5.1.4-4.el5              epel              229 k
      
      
    • Download the lmt src rpm and build
      
      $ mkdir ~/sources/lmt
      $ cd ~/sources/lmt
      $ wget http://lmt.googlecode.com/files/lmt-3.1.2-1.src.rpm
      
      $ rpmbuild --rebuild --sign lmt-3.1.2-1.src.rpm
      
      
      $ ls -l ~/rpmbuild/RPMS/x86_64/lmt-*
      lmt-3.1.2-1.el5.myrepo.x86_64.rpm
      lmt-server-3.1.2-1.el5.myrepo.x86_64.rpm
      lmt-server-agent-3.1.2-1.el5.myrepo.x86_64.rpm
      
      
  3. LMT-GUI RPM build
    • Install the prerequisite java-devel
      
      $ sudo yum install java-devel
      
      =======================================================================================================
       Package                        Arch         Version                       Repository             Size
      =======================================================================================================
      Installing:
       java-1.6.0-openjdk-devel       x86_64       1:1.6.0.0-1.16.b17.el5        centos5-updates        12 M
      
      Transaction Summary
      =======================================================================================================
      
    • Download the lmt-gui src rpm and build
      
      $ mkdir ~/sources/lmt-gui
      $ cd ~/sources/lmt-gui
      $ wget http://lmt.googlecode.com/files/lmt-gui-3.0.0-1.src.rpm
      
      $ rpmbuild --rebuild --sign lmt-gui-3.0.0-1.src.rpm 
      
      
      
      $ rpm -qpi ~/rpmbuild/RPMS/x86_64/lmt-gui-3.0.0-1.el5.myrepo.x86_64.rpm 
      Name        : lmt-gui                      Relocations: (not relocatable)
      Version     : 3.0.0                             Vendor: (none)
      Release     : 1.el5.myrepo                 Build Date: Wed 23 Mar 2011 02:44:25 PM CDT
      Install Date: (not installed)               Build Host: build01
      Group       : Applications/System           Source RPM: lmt-gui-3.0.0-1.el5.myrepo.src.rpm
      Size        : 2347300                          License: GPL
      Signature   : DSA/SHA1, Wed 23 Mar 2011 02:44:25 PM CDT, Key ID xxxx
      Packager    : Jim Garlick 
      URL         : http://code.google.com/p/lmt
      Summary     : Lustre Montitoring Tools Client
      Description :
      Lustre Monitoring Tools (LMT) GUI Client
      
      
    • Next I copy the RPMs to our local repository
      
      $ cd ~/rpmbuild/RPMS/x86_64/
      $ cp -a lmt-* cerebro-1.12-1.x86_64.rpm /share/repo/mirror/myrepo/el5/x86_64/RPMS/
      
      $ cd ../../SRPMS
      $ cp -a cerebro-* /share/repo/mirror/myrepo/el5/SRPMS/
      $ cd ~/sources
      $ cp -a lmt/lmt-3.1.2-1.src.rpm lmt-gui/lmt-gui-3.0.0-1.src.rpm /share/repo/mirror/myrepo/el5/SRPMS/
      
    • Rebuild the repodata for the repository
      
      $ createrepo /share/repo/mirror/myrepo/el5/x86_64/
      
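Step 1 assumes that the ~/rpmbuild tree and the ~/.rpmmacros file already exist. For reference, here is a minimal sketch of that setup. The helper name setup_rpmbuild and the default key name are placeholders of mine, not something from the packages above, and the two signing macros are only needed because the builds here use --sign.

```shell
#!/bin/sh
# Sketch only: create the rpmbuild tree and a minimal ~/.rpmmacros.
# setup_rpmbuild is a hypothetical helper name.
#   $1 = home directory to set up (defaults to $HOME)
#   $2 = GPG key name used by "rpmbuild --sign" (placeholder default)
setup_rpmbuild() {
    home=${1:-$HOME}
    gpg_name=${2:-"My Repo Signing Key"}   # placeholder, use your own key name

    # Standard directories that rpmbuild expects under %_topdir.
    for d in BUILD RPMS SOURCES SPECS SRPMS; do
        mkdir -p "$home/rpmbuild/$d" || return 1
    done

    # Never overwrite macros the user already maintains.
    if [ -e "$home/.rpmmacros" ]; then
        echo "keeping existing $home/.rpmmacros" >&2
        return 0
    fi

    printf '%s\n' \
        "%_topdir $home/rpmbuild" \
        "%_signature gpg" \
        "%_gpg_name $gpg_name" > "$home/.rpmmacros"
}
```

Calling setup_rpmbuild with no arguments sets up the current user's home directory; an existing ~/.rpmmacros is left untouched, so any %_vendor or dist settings stay as they are.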

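The cerebro.spec changes shown in the diff in step 1 are mechanical, so they can also be scripted instead of edited by hand. The following is a rough sketch under the assumption that the spec is the unmodified 1.10 one; bump_spec is a hypothetical helper name, and it writes the result to stdout rather than editing the file in place.

```shell
#!/bin/sh
# Sketch only: reproduce the 1.10 -> 1.12 spec edits from the diff.
# bump_spec is a hypothetical helper name.
#   $1 = path to the spec, $2 = old version, $3 = new version
bump_spec() {
    spec=$1
    old=$2
    new=$3
    # Escape the dots so "1.10" is matched literally.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')

    sed -e "/^Version:/s/$old_re/$new/" \
        -e "s/cerebro-$old_re/cerebro-$new/g" "$spec" |
    awk '
        { print }
        # Package /etc/cerebro.conf right after the init script entry.
        $0 == "%config(noreplace) %{_sysconfdir}/init.d/cerebrod" {
            print "%config(noreplace) %{_sysconfdir}/cerebro.conf"
        }'
}
```

For example, "bump_spec cerebro.spec 1.10 1.12 > cerebro.spec.new" followed by a diff -u against the original should show the same changes as the diff above. Running it twice on the same file would add the cerebro.conf line twice, so only run it against the original spec.
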
Cerebro and LMT Install Instructions

  1. Install cerebro and lmt-server-agent on the MDS and OSS nodes
    
    $ for n in mds-{0..1} oss-{0..2}; do ssh root@lustre-$n yum install -y cerebro lmt-server-agent ; done
    
  2. Install cerebro and lmt-server on the management server
    
    $ ssh root@management-server yum -y install cerebro lmt-server
    
  3. Edit /etc/cerebro.conf as follows (by default the entire file is commented out, so append these lines to the end)
    • On the Lustre servers
      
      cerebro_metric_server 192.168.0.10
      cerebro_event_server 192.168.0.10
      cerebrod_heartbeat_frequency 10 20
      cerebrod_speak on
      cerebrod_speak_message_config 192.168.0.10
      cerebrod_listen off
      
    • On the management server
      
      cerebrod_heartbeat_frequency 10 20
      cerebrod_speak on
      cerebrod_speak_message_config 192.168.0.10
      cerebrod_listen on
      cerebrod_listen_message_config 192.168.0.10
      cerebrod_metric_controller on
      cerebro_metric_server 192.168.0.10
      cerebrod_event_server on
      cerebro_event_server 192.168.0.10
      
  4. Enable and start the cerebrod daemon on the Lustre servers and the management server
    
    $ for n in mds-{0..1} oss-{0..2}; do ssh root@lustre-$n "/sbin/chkconfig cerebrod on && /sbin/service cerebrod start" ; done
    
    $ ssh root@management-server "/sbin/chkconfig cerebrod on && /sbin/service cerebrod start"
    
    
  5. Log in to the management server and verify that it sees all of the servers (this can be run from any of the servers, not just the management server)
    
    $ /usr/sbin/cerebro-stat -m updown_state
    
    MODULE DIR = /usr/lib64/cerebro
    mgmt-srv: 1
    lustre-mds-0: 1
    lustre-mds-1: 1
    lustre-oss-0: 1
    lustre-oss-1: 1
    lustre-oss-2: 1
    
  6. Now run cerebro-stat with the -l switch to list the available metrics (lmt_mdt, lmt_ost and lmt_osc are added by the lmt-server package)
    
    $ /usr/sbin/cerebro-stat -l
    
    MODULE DIR = /usr/lib64/cerebro
    metric_names
    cluster_nodes
    lmt_mdt
    updown_state
    lmt_ost
    lmt_osc
    
  7. Run the ltop command on the management node to view top-like output for the OSTs (it defaults to the first Lustre file system found unless otherwise specified)
    
    $ ltop
    
    Filesystem: lustre
        Inodes:    209.344m total,     77.286m used ( 37%),    132.057m free
         Space:     42.978t total,     15.931t used ( 37%),     27.047t free
       Bytes/s:  0.000g read,       0.000g write,                 1 IOPS
       MDops/s:  4 open,        2 close,     285 getattr, 0 setattr
                     0 link,        0 unlink,      0 mkdir,         0 rmdir
                     1 statfs, 5 rename,      0 getxattr
    >OST S        OSS   Exp   CR rMB/s wMB/s  IOPS   LOCKS  LGR  LCR %cpu %mem %spc
    0000 F stre-oss-0   131    0     0     0     0  515290   87    0    0  100   41
    0001 F stre-oss-0   131    0     0     0     0  528633  106    0    0  100   41
    0002 F stre-oss-1   131    0     0     0     0  509573   16    0    0  100   35
    0003 F stre-oss-1   131    0     0     0     0  518495   21    0    0  100   36
    0004 F stre-oss-2   131    0     0     0     0  533299   49    0    0  100   34
    0005 F stre-oss-2   131    0     0     0     0  527621   61    0    0  100   35
    
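The two cerebro.conf variants in step 3 differ only by role, and every address in them is the same one (192.168.0.10 in this example). As a rough sketch, the lines can be generated from that single address. The helper name cerebro_conf and the role names "server" and "mgmt" are made up for this illustration; the directives themselves are copied from step 3.

```shell
#!/bin/sh
# Sketch only: print the cerebro.conf lines from step 3 for one role.
# cerebro_conf and the role names are hypothetical.
#   $1 = "server" (MDS/OSS node) or "mgmt" (management server)
#   $2 = the address used throughout step 3
cerebro_conf() {
    role=$1
    addr=$2
    [ -n "$addr" ] || { echo "usage: cerebro_conf server|mgmt ADDRESS" >&2; return 2; }

    case $role in
    server)
        # Lustre servers speak but do not listen.
        printf '%s\n' \
            "cerebro_metric_server $addr" \
            "cerebro_event_server $addr" \
            "cerebrod_heartbeat_frequency 10 20" \
            "cerebrod_speak on" \
            "cerebrod_speak_message_config $addr" \
            "cerebrod_listen off"
        ;;
    mgmt)
        # The management server also listens and runs the metric
        # controller and event server.
        printf '%s\n' \
            "cerebrod_heartbeat_frequency 10 20" \
            "cerebrod_speak on" \
            "cerebrod_speak_message_config $addr" \
            "cerebrod_listen on" \
            "cerebrod_listen_message_config $addr" \
            "cerebrod_metric_controller on" \
            "cerebro_metric_server $addr" \
            "cerebrod_event_server on" \
            "cerebro_event_server $addr"
        ;;
    *)
        echo "unknown role: $role" >&2
        return 2
        ;;
    esac
}
```

The output is meant to be appended to /etc/cerebro.conf on the matching host, for example by piping "cerebro_conf server 192.168.0.10" through ssh into "cat >> /etc/cerebro.conf".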

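The updown_state check in step 5 is easy to eyeball with six nodes, but it can also be scripted. Below is a small sketch that reads the cerebro-stat output and compares it against a list of expected node names. The helper name check_updown is hypothetical, and treating any value other than 1 as "down" is my assumption based on the output shown above.

```shell
#!/bin/sh
# Sketch only: compare "cerebro-stat -m updown_state" output (on stdin)
# against the expected node names given as arguments.
# check_updown is a hypothetical helper name.
# Assumption: a state of 1 means up, anything else means down.
check_updown() {
    awk -v want="$*" '
        BEGIN { n = split(want, nodes, " ") }
        # Lines look like "lustre-oss-0: 1"; the MODULE DIR line has
        # more than two fields and is skipped.
        NF == 2 { name = $1; sub(/:$/, "", name); state[name] = $2 }
        END {
            rc = 0
            for (i = 1; i <= n; i++) {
                node = nodes[i]
                if (!(node in state))        { print "missing " node; rc = 1 }
                else if (state[node] != "1") { print "down    " node; rc = 1 }
                else                         { print "up      " node }
            }
            exit rc
        }'
}
```

Typical use would be "/usr/sbin/cerebro-stat -m updown_state | check_updown mgmt-srv lustre-mds-0 lustre-mds-1 lustre-oss-0 lustre-oss-1 lustre-oss-2"; the exit status is non-zero if any expected node is down or not listed.
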
1 comment:

Anonymous said...

Your detailed instructions are great. Could you clarify whether the IP address 192.168.0.10 on both the management server and the Lustre servers points to the management node? For some reason, when I ran ltop on the management server it did not return any file system information, even though the cerebro-stat command sees all Lustre nodes. Does the Lustre file system need to be mounted on the management server node? Thanks, Ray