Using MINOS Software at Oxford and RAL

Last modified: Fri Apr 2 11:22:58 BST 2010

See also MINOS on the GRID

SL4 Migration:

Oxford and RAL have migrated to SL4 (Scientific Linux 4) and the
information about SL3 has been stripped from this document.
See the old version of this page for that SL3 information.

The Oxford Setup

Machines

The Fortran and OO codes run on Linux (PPLXGENNG and the PPLXWN* batch farm).

Disks

We currently have one disk:-

The RAL Setup

This section describes the setup of the csf Linux farm at RAL (csf.rl.ac.uk), a.k.a. the RAL Tier 1.

October 2007: The farm is migrating to Scientific Linux 4 (SL4) and the GRID!

  1. The SL3 PBS service is winding down and will close on 2 November 2007.

  2. The SL4 PBS service (queue = prod4) is the preferred service and will become the default on 2 November 2007. Currently there are no SL4 front-ends; all SL4 compiling has to be done as part of the batch jobs submitted to prod4.

  3. Use of the batch farm will be GRID-only (no PBS) as of January 2008.

Machines

The RAL Tier 1 is a farm of PCs running Scientific Linux 4. Both the Fortran and C++ code systems are installed. Access is via the GRID; there is no direct batch queue access except for VO maintenance.
For farm usage by experiment check: Tier 1/A statistics and data disk performance

Requests for MINOS allocation of farm resources are discussed at GridPP User Board Meetings and are determined primarily according to the number of UK authors.

At the meeting on Tuesday 24 June 2008 MINOS requested, and were granted:-

For the UB meeting on Monday 01 December 2008 our MINOS resource needs for the WLCG pledge were:-

Year  CPU (kSpecInt2k)     Castor Disk (TB)  NFS Disk (TB)  Tape (TB)
2009  125                  8.7               0.2            15
2010  125                  8.7               0.2            20
2011  125 (Q1) 0 (Q2/4)    8.7               0.2            20
2012  0                    8.7               0.2            20
2013  0                    8.7               0.2            20

For the UB meeting in Spring 2009 we requested:-

Year  CPU (kSpecInt2k)     Castor Disk (TB)  NFS Disk (TB)  Tape (TB)
2009  250 (Q2) 125 (Q3/4)  8.7               1.0            15

Our increased request for NFS disk is to allow multi-step processing to cache intermediate results to disk.

In September 2009 our 2009/2010 quotas were set at:-

Year  CPU (kSpecInt2k)     Castor Disk (TB)  NFS Disk (TB)  Tape (TB)
2009  225 (Q4)             8.7               0.2            15
2010  125                  8.7               0.2            15

In November 2009, as part of "GRIDPP4 Proposal for Non LHC and Emerging Experiments/Projects" we requested:-

Year  CPU (kSpecInt2k)  Disk (TB)  Tape (TB)
2011  125               10         20
2012  125               10         20
2013  25                10         20
2014  25                10         20

Disks

See also RAL Tier-1 MINOS Farm Disk Accounting Metrics

and RAL Tier-1 Cluster Report

We currently have 2 NFS disks:-

New Snapshot and Frozen Base Releases are installed as required. Nick West is the Software Librarian.

We also have Castor disk space:-

At RAL the tool DCM (Data Cache Manager) is used to copy and share FNAL data files on these disks. For example:-

  $MINOS_TOOLS/dcm.sh get C00110071_0000.mdaq.root C00110072_0000.mdaq.root
uses wget to copy the specified files and catalogues them. For more information see Data Cache Manager

Database

Nick West maintains the MINOS databases on the MySQL server:-
  sql.gridpp.rl.ac.uk
The names of the databases and user accounts follow the standard convention except that each is prefaced by minos_ e.g. minos_reader, minos_offline etc.

The database is not directly accessible from off-site but a connection can be made via an SSH tunnel. Proceed as follows:-

  1. Log into some local machine (call it my-proxy-machine) that is NOT a local MySQL server and type:-
        ssh -g -L 3306:sql.gridpp.rl.ac.uk:3306 user@csf.rl.ac.uk
    
    where "user" is some valid account at RAL. This logs you into RAL but also sets up a tunnel for the local port 3306 (MySQL) on my-proxy-machine that appears on csf and gets forwarded to the RAL MySQL server. The -L tells SSH to set up the tunnel and the -g says to make the tunnel global i.e. visible to other machines.

  2. Now with another session on my-proxy-machine or some other machine you can treat my-proxy-machine (the host name must be fully qualified) as a proxy for the RAL machine. So you could:-
      setenv ENV_TSQL_URL mysql:odbc://my-proxy-machine.physics.my-site.ac.uk/minos_offline
      setenv ENV_TSQL_USER minos_reader
      setenv ENV_TSQL_PSWD "\0"
    
Setting up a tunnel exposes the RAL database to additional risk if the proxy machine isn't properly managed. For this reason this procedure should not be followed unless the local MySQL port on the proxy machine is behind the local site firewall.
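
For sh/bash users the tunnel command and client settings above could be wrapped roughly as follows. This is only a sketch: the helper names ral_tunnel_cmd and ral_db_env are made up for illustration and are not part of any MINOS tool. Both helpers just print the commands, so nothing is run until you paste or eval the output. ENV_TSQL_PSWD is deliberately left out and should be set as described above.

```shell
# Sketch only: helper names are hypothetical, not part of any MINOS tool.

# Print the ssh command that opens the tunnel for a given RAL account.
ral_tunnel_cmd() {
    local user="${1:-}"
    echo "ssh -g -L 3306:sql.gridpp.rl.ac.uk:3306 ${user}@csf.rl.ac.uk"
}

# Print sh/bash equivalents of the setenv lines for a given proxy host.
# The proxy host name should be fully qualified.
ral_db_env() {
    local proxy="${1:-}"
    echo "export ENV_TSQL_URL=mysql:odbc://${proxy}/minos_offline"
    echo "export ENV_TSQL_USER=minos_reader"
}
```

For example, eval "$(ral_db_env my-proxy-machine.physics.my-site.ac.uk)" would set the two variables in the current shell.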

GRID Migration

See also GRID UK Data Access Plans

We have now completed the migration to the GRID, apart from a few special cases. The migration was based on the MINOS input to the UB Meeting on 04 Dec 2007.

The key points for us are:-

GRID Migration FAQs

  1. How do I get to a UI (User Interface)?

    RAL was the popular choice but there are others at Tier 2 sites e.g. Oxford. Any site running Scientific Linux 4 should be able to become a UI by installing the gLite GRID middleware, probably with a bit of help from RAL. A good starting point is:-
      gLite 3.1 UI tarball distribution
    
  2. How do I get data from Castor?

    Once you have access to a UI you will still be able to read and write data either using lcg-utils or DCM. See:-
      Tutorial: Accessing Storage Elements
    
    For people outside the UK I (Nick) am prepared to help out with data movement back to FNAL so long as it doesn't take too much time.

  3. Can I use a Test Release?

    Yes you can; it's not without pain but is possible:-
      Installing a Test Release on a Worker Node
    

Preparation

At Oxford

At Oxford, software for Scientific Linux 4 (PPLXGENNG) is being installed using RSD, which can be used to install complete applications, typically minossoft (Frozen or Snapshot Releases), or C++/Fortran hybrid labyrinth/minossoft systems e.g. CedarDaikon.

At Oxford RSD is used for all Frozen and Snapshot releases, but the Development release is maintained using the classic SRT setup. To simplify setup a wrapper script is provided to source the appropriate script. For a csh/tcsh shell:-

  source /data/minos/software/setup_minos_oxford.csh {release|application}

Where release or application is an Installed Base Release

e.g.

  source /data/minos/software/setup_minos_oxford.csh development_sl4
  source /data/minos/software/setup_minos_oxford.csh minossoft:S07-10-22-R1-26-build_1-SL4
  source /data/minos/software/setup_minos_oxford.csh

  - the last of these just sets up support tools e.g. dcm, svn, SameWebClient.
For sh/bash use the same script replacing the extension .csh by .sh

By default, when setting up any minossoft based application you will get the unoptimised version. To subsequently switch to the optimised one type:-

  srt_setup SRT_QUAL=maxopt
or to return to the unoptimised one:-
  srt_setup SRT_QUAL=default

At RAL

For RAL SL4, software is being installed using RSD which can be used to install complete applications, typically minossoft (Frozen or Snapshot Releases), or C++/Fortran hybrid labyrinth/minossoft systems e.g. CedarDaikon.

For a csh/tcsh shell:-

  source /stage/minos-data1/software/grid/setup_minos_local-SL4.csh {release|application}

 Where
  <application>  Required application e.g. minossoft:S07-09-20-R1-26-build_1-SL4
  For a list see Installed Base Release at RAL - SL4

e.g.

  source /stage/minos-data1/software/grid/setup_minos_local-SL4.csh minossoft:S07-10-22-R1-26-build_1-SL4
  source /stage/minos-data1/software/grid/setup_minos_local-SL4.csh CedarDaikon:03-build_0-SL4
  source /stage/minos-data1/software/grid/setup_minos_local-SL4.csh  

  - the last of these just sets up support tools e.g. dcm, svn, SameWebClient.
For sh/bash use the same script replacing the extension .csh by .sh

By default, when setting up any minossoft based application you will get the unoptimised version. To subsequently switch to the optimised one type:-

  srt_setup SRT_QUAL=maxopt
or to return to the unoptimised one:-
  srt_setup SRT_QUAL=default
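
The Oxford and RAL setup lines differ only in the script path and extension, so a small helper can build the right one. The following is a sketch under that assumption; minos_setup_cmd is a hypothetical name, not an installed tool, and it only prints the source command rather than running it.

```shell
# Sketch only: minos_setup_cmd is a hypothetical helper, not an installed tool.
# Usage: minos_setup_cmd <oxford|ral> <csh|sh> [application]
minos_setup_cmd() {
    local site="${1:-}" ext="${2:-}" app="${3:-}" script
    case "$site" in
        oxford) script=/data/minos/software/setup_minos_oxford ;;
        ral)    script=/stage/minos-data1/software/grid/setup_minos_local-SL4 ;;
        *)      echo "unknown site: $site" >&2; return 1 ;;
    esac
    case "$ext" in
        csh|sh) ;;
        *)      echo "unknown shell family: $ext" >&2; return 1 ;;
    esac
    # With no application the script just sets up the support tools.
    echo "source ${script}.${ext}${app:+ $app}"
}
```

In bash, eval "$(minos_setup_cmd ral sh minossoft:pro-SL4)" would then source the RAL script, after which srt_setup SRT_QUAL=maxopt can be used as described above.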

Test Releases on SL4 at RAL

As there are currently no SL4 front-ends at RAL, building Test Releases is a bit of a problem. There are two solutions: either use a script that Alex has developed, which creates and submits a batch job to install and build a Test Release, or split the process, creating the Test Release on the front-end but building it on the back-end.

Using Alex's script

  1. Preparation
      cd /stage/minos-data1/allusers/tools/mtr
    
      export MTR_DIR=<my-working-directory>   (the parent directory that will hold the Test Release)
      export MTR_NOTIFY=<email-address>       (used to send confirmation email)
    
       - or the equivalent setenv commands if using a csh based shell.
    
    

  2. Execution
    Command syntax:-
    
      ./mtr <create|update|build> <test_release_name> \
         <base_release_name> <package1> <package2> (...) <packageN>
    
    Examples:-
    
      ./mtr create MyTest minossoft:S07-09-20-R1-26-build_1-SL4 NCUtils AnalysisNtuples
      ./mtr update MyTest CedarDaikon:03-build_0-SL4                (update and rebuild all existing packages)
      ./mtr update MyTest CedarDaikon:03-build_0-SL4 MCReweight     (as above but also add new ones)
      ./mtr build  MyTest CedarDaikon:03-build_0-SL4
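
Since each mtr call ends up as a batch job, it can be worth checking the arguments and environment before submitting. The wrapper below is a sketch only: mtr_cmd is a hypothetical name, and the checks simply mirror the syntax and the two environment variables described above. It prints the ./mtr command line instead of running it.

```shell
# Sketch only: mtr_cmd is a hypothetical wrapper that checks its arguments
# and prints the ./mtr command line instead of running it.
mtr_cmd() {
    local action="${1:-}" test_rel="${2:-}" base_rel="${3:-}"
    case "$action" in
        create|update|build) ;;
        *) echo "action must be create, update or build" >&2; return 1 ;;
    esac
    if [ -z "$test_rel" ] || [ -z "$base_rel" ]; then
        echo "test release and base release names are required" >&2
        return 1
    fi
    if [ -z "${MTR_DIR:-}" ] || [ -z "${MTR_NOTIFY:-}" ]; then
        echo "set MTR_DIR and MTR_NOTIFY first" >&2
        return 1
    fi
    shift 3
    # Any remaining arguments are package names.
    echo "./mtr $action $test_rel $base_rel${*:+ $*}"
}
```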
    
Split operation

As we still have minossoft, and hence SRT, installed on the front-end it's possible to split the process into two parts even if the same version of minossoft is not available on both:-

  1. Create the Test Release on the SL3 Front-end

    Create in the usual way using any available SL3 version of minossoft:-

      my_base_release_sl3=R1.25
      my_base_release_sl4=S07-09-20-R1-26
    
      my_test_release_parent=some-existing-directory
      my_test_release_name=some-test-release-name
      my_package=some-package
    
      source /stage/minos-data1/software/minossoft/setup/setup_minossoft_csf.sh $my_base_release_sl3
    
      cd $my_test_release_parent
      newrel -t $my_base_release_sl3 $my_test_release_name
      cd $my_test_release_name
      srt_setup -a
      addpkg -h $my_package
    
    Now comes the sneaky part: switch the supporting base release to the one that is needed on the SL4 back-end:-
      rm .base_release
      echo $my_base_release_sl4 > .base_release
    

  2. Build the release on the SL4 Back-end

    This works because SRT is smart enough to create the platform specific directories to hold binaries as and when it needs them. So even though the Test Release was created on SL3, it can still be built on SL4, which is the whole point of SRT's ability to support multiple platforms sharing a single source tree.

      my_base_release_sl4=S07-09-20-R1-26
      my_application=minossoft:$my_base_release_sl4-build_1-SL4
      my_test_release_parent=some-existing-directory
      my_test_release_name=some-test-release-name
    
      source /rutherford/minos-soft2/tools/GridTools/Scripts/setup/setup_minos_lcg_grid.sh $my_application
    
      cd $my_test_release_parent/$my_test_release_name
      gmake all
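
Because the split procedure relies on .base_release having been rewritten by hand, a quick check before building can catch a forgotten edit. This is a sketch only; base_release_matches is a hypothetical helper and assumes the file holds the release name on its first line, as written by the echo command above.

```shell
# Sketch only: base_release_matches is a hypothetical helper.
# Succeeds if <test_release_dir>/.base_release names the expected base release.
base_release_matches() {
    local dir="${1:-}" expected="${2:-}" actual=""
    if [ ! -f "$dir/.base_release" ]; then
        echo "no .base_release in $dir" >&2
        return 1
    fi
    # Read the first line only; read strips surrounding whitespace.
    read -r actual < "$dir/.base_release" || true
    [ "$actual" = "$expected" ]
}
```

In the batch job this could guard the build, e.g. base_release_matches "$my_test_release_parent/$my_test_release_name" "$my_base_release_sl4" && gmake all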
    

Installed Base Releases

At RAL

Application                            Last Updated  Minossoft Version  Neugen3 Version  ROOT Version  Notes
CedarDaikon04:071114-build_1-SL4       22 Nov 2007   R1.24.3            v3_5_5           5-12-00f      a.k.a. pro-SL4
DogwoodDaikon04:build_0-SL4            23 Jul 2009   R2.0.1             v3_5_5           5-22-00a      23 Jul High momentum track bug fix
DogwoodDaikon07:build_1-SL4            23 Jul 2009   R2.0.1             v3_5_5           5-22-00a      23 Jul Daikon 07 + High momentum track bug fix
minossoft:R1.28-build_1-SL4            23 Jan 2008   R1.28              v3_5_5           5.18.00       None
minossoft:R1.29-build_1-SL4            21 May 2008   R1.29              v3_5_5           5.18.00c      None
minossoft:R1.30-build_1-SL4            01 Sep 2008   R1.30              v3_5_5           5.20.00       None
minossoft:S09-10-16-R2-00-build_1-SL4  19 Oct 2009   S09-10-16-R2-00    v3_5_5           5.24.00b      a.k.a. old-SL4
minossoft:S10-01-19-R2-01-build_1-SL4  21 Jan 2010   S10-01-19-R2-01    v3_5_5           5.26.00       a.k.a. new-SL4 and pro-SL4

To see which applications are currently installed:-

  ls -l /stage/minos-data1/software/grid/apps
and then to see what versions of say minossoft are installed:-
  ls -l /stage/minos-data1/software/grid/apps/minossoft
and to see what libraries a given application contains type e.g.:-
  cat /stage/minos-data1/software/grid/apps/minossoft/S07-10-22-R1-26-build_2-SL4/installed_libraries
To get the application name from the directory names replace the intervening "/" with a ":" e.g.:-
  minossoft/S07-09-20-R1-26-build_1-SL4 -> minossoft:S07-09-20-R1-26-build_1-SL4
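
The directory-to-name rule can be applied mechanically. The two helpers below are a sketch with hypothetical names: app_name_from_dir does the "/" to ":" substitution on one relative path, and list_apps walks an apps directory (two levels deep, as in the ls commands above) and prints every application name it finds.

```shell
# Sketch only: both helpers are hypothetical.

# Turn "<application>/<version>" into "<application>:<version>".
app_name_from_dir() {
    local rel="${1:-}"
    echo "${rel%%/*}:${rel#*/}"
}

# List everything installed under an apps directory as application:version.
list_apps() {
    local apps_dir="${1:-}" path
    for path in "$apps_dir"/*/*/; do
        [ -d "$path" ] || continue       # skip if the glob matched nothing
        path="${path%/}"                 # drop the trailing slash
        app_name_from_dir "${path#"$apps_dir"/}"
    done
}
```

At RAL this would be called as list_apps /stage/minos-data1/software/grid/apps and the output can be passed straight to the setup script.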
To see in detail exactly what any application consists of, look it up in the current build_config_table.dat

At Oxford

The Development Base Release, a few recent Frozen (validated) Releases and the latest 3 Snapshot Releases will eventually be maintained.

Application                            Last Updated  Minossoft Version  Neugen3 Version  ROOT Version  Notes
CedarDaikon04:071114-build_1-SL4       15 Jan 2008   R1.24.3            v3_5_5           5-12-00f      a.k.a. pro-SL4
minossoft:R1.28-build_1-SL4            18 Jan 2008   R1.28              v3_5_5           5.18.00       None
minossoft:R1.29-build_1-SL4            21 May 2008   R1.29              v3_5_5           5.18.00c      None
minossoft:R1.30-build_1-SL4            01 Sep 2008   R1.30              v3_5_5           5.20.00       None
minossoft:R2.2-build_1-SL4             02 Mar 2010   R2.2               v3_5_5           5.26.00b      None
minossoft:R2.3-build_1-SL4             09 Apr 2010   R2.3               v3_5_5           5.26.00b      None
minossoft:S08-03-20-R1-28-build_1-SL4  25 Mar 2008   S08-03-20-R1-28    v3_5_5           5.18.00a      Held by Justin
minossoft:S09-10-16-R2-00-build_1-SL4  16 Oct 2009   S09-10-16-R2-00    v3_5_5           5.24.00b      a.k.a. old-SL4
minossoft:S10-01-19-R2-01-build_1-SL4  19 Jan 2010   S10-01-19-R2-01    v3_5_5           5.26.00       a.k.a. pro-SL4
minossoft:S10-04-02-R2-03-build_1-SL4  02 Apr 2010   S10-04-02-R2-03    v3_5_5           5.26.00b      a.k.a. new-SL4
development (Use for testing only!)    Recently      HEAD               v3_5_5           HEAD          None

To check what applications are currently installed:-

  ls -l /data/minos/software/grid/apps
and then to see what versions of say minossoft are installed:-
  ls -l /data/minos/software/grid/apps/minossoft
and to see what libraries a given application contains type e.g.:-
  cat /data/minos/software/grid/apps/minossoft/S07-10-22-R1-26-build_2-SL4/installed_libraries
To get the application name from the directory names replace the intervening "/" with a ":" e.g.:-
  minossoft/S07-09-20-R1-26-build_1-SL4 -> minossoft:S07-09-20-R1-26-build_1-SL4
To see in detail exactly what any application consists of, look it up in the current build_config_table.dat

Soft Links to Snapshot Releases

At Oxford and RAL soft links are maintained to ease migration between Snapshot Releases. There are 3 soft links, which appear as old-SL4, pro-SL4 and new-SL4 in the Installed Base Releases tables. They should be O.K. if simply running Base Releases, but if building and running Test Releases use explicit Frozen or Snapshot versions.

For example, just to use the current production Snapshot:-

  Oxford: source /data/minos/software/setup_minos_oxford.sh/csh                  minossoft:pro-SL4
     RAL: source /stage/minos-data1/software/grid/setup_minos_local-SL4.sh/.csh  minossoft:pro-SL4

The Development Base Release

At Oxford a Development Release is updated daily and is built against ROOT updated to CVS HEAD. Its principal purpose is the early detection of problems within our code and between our code and ROOT. Consequently it is always changing.

The Development Release should not be used as the basis of production work.

Although it is constantly changing it is recommended for code development and should always work. In order to achieve this we have a double system. The basic idea is that there are two independent builds of minossoft called dev_1 and dev_2 and associated ROOT builds called root_cvs_1 and root_cvs_2. At any one time one version works at some level and is available for use. The other is updated and built and, should it appear to work, the system flips so that it becomes the current version. Flipping is achieved by soft links.

 $SRT_DIST/releases:-
    dev_1
    dev_2
    development -> dev_1 (or dev_2)
  
 $INSTALLATION/:-

    root_cvs_1
    root_cvs_2
    root_cvs -> root_cvs_1 (or root_cvs_2)

Updating and flipping works as follows:-

  1. At midnight the system updates ROOT and minossoft on the version of development that isn't current i.e. the one not pointed to by the soft links

  2. If the build and a simple loon test job succeed it prepares a script that will switch to the new release but does not execute it.

  3. A separate cron job runs at 8am and if it finds the release switching script, executes and deletes it.

  4. If the build is a long one or has its start delayed (it mostly runs on the farm) and finishes after 8 am, the switch won't take place that day even if the build is successful. However the presence of the switching script will disable the next nightly build so that the flip will take place at 8am on the following day.

  5. If the build fails no switching script is created, the soft link remains unchanged, and the next nightly update and rebuild works on the broken version.
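
The flipping steps above can be sketched as three small functions. This is an illustration only, not the real cron jobs: every function name and the switch.sh file name are hypothetical, and for brevity only the development link is handled (the root_cvs link would flip the same way).

```shell
# Sketch only: all names below are hypothetical; the real cron jobs may differ.
# $1 is a releases directory holding dev_1, dev_2 and the "development" link.

# Print the build that the "development" link does NOT point to.
inactive_build() {
    local releases="${1:-}" current
    current=$(readlink "$releases/development")
    if [ "$current" = dev_1 ]; then echo dev_2; else echo dev_1; fi
}

# Nightly job guard: skip the build while an unexecuted switch script exists.
nightly_may_build() {
    [ ! -f "${1:-}/switch.sh" ]
}

# Nightly job, after a successful build and test: write the switch script
# but do not execute it.
prepare_switch() {
    local releases="${1:-}" target
    target=$(inactive_build "$releases")
    echo "ln -sfn $target '$releases/development'" > "$releases/switch.sh"
}

# Morning job: if a switch script exists, execute it and delete it.
run_switch() {
    local releases="${1:-}"
    [ -f "$releases/switch.sh" ] || return 0
    sh "$releases/switch.sh" && rm -f "$releases/switch.sh"
}
```

Keeping the switch as a separate script is what lets a late-finishing build wait for the next 8am run, and its presence doubles as the flag that holds off the next nightly build.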
If it works as advertised:-

Caution

When adding packages to a Test Release based on development, don't do e.g.:-
  addpkg XXX
as that will attempt to check out the currently selected HEAD_* release and will fail with a message of the type
Release development uses XXX version HEAD_2, will check that out
...
cvs [checkout aborted]: no such tag HEAD_2
Instead do:-
  addpkg XXX -h

Switching Versions

See Preparation

CVS

If you need read access to the FNAL code repository, for example if you want to build your own test release, then you also need the lines:-
  setenv CVSROOT :pserver:anonymous@minos1.fnal.gov:/cvs/minoscvs/rep1
  setenv CVS_RSH ssh
which tell cvs to use the anonymous server at minos1.fnal.gov, using SSH security. When you want to access the Repository you may have to login. If attempting to access results in a request to login then do so using:-
  cvs -d $CVSROOT login
and you will be asked for a password - ask your Software Librarian if you don't know it. If you need both read and write access see Acquiring access to the MINOS CVS repository.

DCM - Data Cache Manager

DCM (Data Cache Manager) is a simple utility to manage a collection of data files spread over multiple disks and owned by multiple users. It is installed both at RAL and at Oxford. To see what disk space is available and where the rest has gone type:-
  $MINOS_TOOLS/dcm.sh disk_usage
The system can retrieve files from FNAL dCache, e.g.:-
  $MINOS_TOOLS/dcm.sh get C00070590_0000.tdaq.root C00070590_0001.tdaq.root ...
and also read and write files to local SEs (Storage Elements) at RAL.
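
When fetching a range of runs it can help to generate the file names rather than type them. The helper below is a sketch: daq_file_name is a hypothetical name, and the naming pattern (prefix letter, 8-digit run, 4-digit subrun, stream, .root) is only inferred from the example file names on this page, not taken from a specification.

```shell
# Sketch only: daq_file_name is a hypothetical helper. The naming pattern
# (prefix letter, 8-digit run, 4-digit subrun) is inferred from the examples
# on this page, not taken from a specification.
# Pass run and subrun as plain decimal numbers without leading zeros,
# otherwise printf may try to read them as octal.
daq_file_name() {
    local prefix="${1:-}" run="${2:-}" subrun="${3:-}" stream="${4:-}"
    printf '%s%08d_%04d.%s.root\n' "$prefix" "$run" "$subrun" "$stream"
}
```

For example, $MINOS_TOOLS/dcm.sh get $(daq_file_name C 70590 0 tdaq) $(daq_file_name C 70590 1 tdaq) would request the two files named in the example above.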

If moving many files to/from RAL it's a good idea to log into one of the "Data Mover" front-ends:

   csfmove01.rl.ac.uk 
or csfmove02.rl.ac.uk
as they have higher bandwidth connectivity to the Internet.

For more details see Data Cache Manager

SAM - Sequential data Access with Metadata

The SAM Web Services package is installed at Oxford and RAL and made available as part of the standard setup script. The examples shown in the above URL should run as shown e.g.:-
samTranslateDimensions \
   --dim="run_type physics% \
          and data_tier sntp-near \
          and physical_datastream_name cosmic \
          and start_time < to_date('2005-10-02','yyyy-mm-dd') \
          and end_time > to_date('2005-10-01','yyyy-mm-dd')" 
Notes about forming queries (the --dim="..." option above):-

You can locate a file with:-
samLocate \
  --file=N00008695_0023.cosmic.sntp.R1_18.0.root 
and retrieve the complete metadata entry for the file with:-
samGetMetadata  \
   --file=F00032684_0023.mdaq.root
The script minosGetFiles.py is installed so to copy the files that satisfy a query into the current directory:-
minosGetFiles.py --dim=query
e.g.:-
minosGetFiles.py --dim="file_name = N00008695_002%.cosmic.sntp.R1_18.0.root"
although now that DCM uses SAM there should be no need to work directly with the SAM commands.
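
If you do need to build queries by hand, the time-window form of the first example can be assembled by a small helper. This is a sketch only: sam_dim_for_window is a hypothetical name, the constraint names are copied from the example above, and it is an assumption that SAM treats a single-line query the same as the multi-line one.

```shell
# Sketch only: sam_dim_for_window is a hypothetical helper that assembles a
# --dim string in the style of the samTranslateDimensions example above.
# Usage: sam_dim_for_window <data_tier> <datastream> <from-date> <to-date>
# Dates are yyyy-mm-dd; files overlapping the window from..to are selected.
sam_dim_for_window() {
    local tier="${1:-}" stream="${2:-}" from="${3:-}" to="${4:-}"
    echo "run_type physics% and data_tier $tier and physical_datastream_name $stream and start_time < to_date('$to','yyyy-mm-dd') and end_time > to_date('$from','yyyy-mm-dd')"
}
```

It would be used as samTranslateDimensions --dim="$(sam_dim_for_window sntp-near cosmic 2005-10-01 2005-10-02)"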
Contact: Nick West (n.west1@physics.ox.ac.uk)