DCM names SEs (Storage Elements) using the following syntax:-
<site>-<type>-<service>     e.g. ral_t1-castor-test_d0t1
Where:-
<site>      Site name e.g. ral_t1 (RAL Tier 1) or fnal
<type>      The storage technology e.g. dcache or castor
<service>   The individual service e.g. tape
When DCM is run, its initial output includes a list of the SEs it can access. For example:-
Local Storage Elements (in search order):-
ral_t1_ui-nfs             Local NFS Disks
ral_t1-castor-prod_d0t1   RAL T1 CASTOR disk0tape1 Production Service
ral_t1-castor-test_d0t1   RAL T1 CASTOR disk0tape1 Test Service
ral_t1-dcache-disk        RAL T1 dCache Disk Store
ral_t1-dcache-tape        RAL T1 dCache Tape Store
fnal-dcache-enstore       FNAL dCache interface to Enstore
The order reflects the default search order used to retrieve files. Note how the local disk is treated as an SE.
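To make the naming rule concrete, here is a minimal Python sketch that splits an SE name into its parts. It is not DCM code: the function name parse_se_name and the SEName record are made up for illustration, and it assumes that neither <site> nor <type> contains a hyphen. Names with only two parts, such as ral_t1_ui-nfs, are returned with no service.

```python
from typing import NamedTuple, Optional


class SEName(NamedTuple):
    site: str
    technology: str          # the <type> field of the SE name
    service: Optional[str]   # None when the name has no <service> part


def parse_se_name(name: str) -> SEName:
    """Split an SE name on its first two hyphens (hypothetical helper)."""
    parts = name.split("-", 2)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a recognisable SE name: {name!r}")
    service = parts[2] if len(parts) == 3 else None
    return SEName(parts[0], parts[1], service)


if __name__ == "__main__":
    for se in ("ral_t1-castor-test_d0t1", "fnal-dcache-enstore", "ral_t1_ui-nfs"):
        print(se, "->", parse_se_name(se))
```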
DCM uses the SE names as the basis for a "DCM URL". The syntax is:-
dcm://<SE_name>/<SE_dir>/<File_name>#<byte_size>
e.g. dcm://fnal-dcache-enstore/pnfs/fs/usr/minos/reco_near/R1_18/snts_data/2005-04/N00007148_0008.spill.snts.R1_18.0.root#129262
When requested to transfer files, DCM first converts the file names into URLs, which are then used to determine the appropriate commands to perform the operation.
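The URL layout can be taken apart mechanically. The Python sketch below is an illustration only, not the DCM implementation; parse_dcm_url and DcmUrl are hypothetical names. It assumes the SE name is the first path component, the file name is the last one, and the byte size is whatever follows the final '#'. The SE directory is returned without a leading slash.

```python
from typing import NamedTuple


class DcmUrl(NamedTuple):
    se_name: str
    se_dir: str
    file_name: str
    byte_size: int


def parse_dcm_url(url: str) -> DcmUrl:
    """Split a DCM URL into SE name, SE directory, file name and size."""
    prefix = "dcm://"
    if not url.startswith(prefix):
        raise ValueError(f"not a DCM URL: {url!r}")
    # The byte size follows the last '#'.
    body, hash_sep, size = url[len(prefix):].rpartition("#")
    if not hash_sep or not size.isdigit():
        raise ValueError(f"missing or invalid byte size: {url!r}")
    # First component is the SE name; last component is the file name.
    se_name, _, path = body.partition("/")
    se_dir, _, file_name = path.rpartition("/")
    if not se_name or not file_name:
        raise ValueError(f"missing SE name or file name: {url!r}")
    return DcmUrl(se_name, se_dir, file_name, int(size))


if __name__ == "__main__":
    print(parse_dcm_url(
        "dcm://fnal-dcache-enstore/pnfs/fs/usr/minos/reco_near/R1_18/"
        "snts_data/2005-04/N00007148_0008.spill.snts.R1_18.0.root#129262"
    ))
```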
dcm_cache/
which is where DCM will place files, although it can also manage files
that users have placed elsewhere on these disks.
On the first disk in the list there must also be a top-level directory:-
dcm_catalogue/
this is the "soft links catalogue" which is where DCM places soft
links to data files on all the disks it manages. That directory has
the sub-directory
DCM/
where DCM maintains its text catalogues and which also holds
history.log
which records when files are retrieved from SEs.
When DCM is run it starts by listing the disks it is managing. For example:-
DCM configuration:-
List of DCM-managed disks: /stage/minos-data1/d3
/stage/minos-data1/d4
/stage/minos-data1/d5
/stage/minos-data1/d6
/stage/minos-data1/d7
Ownership group: minos
Scratch directory /tmp/dcm_scratch_area_13645
$MINOS_TOOLS/dcm.sh {global options} command {command options} {command args}
--debug n   Switch on debug level n (=0 off)
--expt e    Selected experiment. Allowed values: minos [default] and sno
--site s    Select site. CAUTION: use for testing only!!
catalogue {<file>...<file>...} { --all}
Example: catalogue /stage/minos-data1/d4/C00080277_0000.mdaq.root
This adds the file both to the text catalogue and, as a soft link to the file, to the soft links catalogue:-
dcm_catalogue/
directory which must be the top level directory on the first disk
managed by DCM. This command is useful if adding a file that is
not within the set of directories managed by DCM.
Example: catalogue --all
This uses the results of the last disk scan (see the survey command) and checks that all the data files that it found are in the text and soft links catalogues.
directory_ownership {mode}
where
mode (optional):-
"full" [default] show every directory
"compress" suppress sub-directories wholly owned by a single user
This command uses the results of the last disk scan (see the survey command) and reports, for each
data directory, the users who own files in it, including its
sub-directories.
DCM classifies all files into 1 of 4 types:-
get {command options} file-query file-query ...
Transfer one or more files from an SE (Storage Element).
--accept_dcm_url Return files as DCM URLs; doesn't attempt any transfers
--accept_root_url Return files as ROOT URLs if supported; otherwise transfer.
--demand_complete_set
Quit without getting any files unless able to get them all
Default: return whatever files can be located.
--file_list f If command succeeds, record list of files (or URLs) in file f.
Will include all files i.e. even those already on disk.
Caution: On input f must not exist.
--force_local Force a copy to local dir (see --local_dir) unless already
there.
--local_dir d Copy files to specified directory.
Default: the dcm_cache directory of the disk with most space
--max_files n Set upper limit on number of files to transfer.
Default 10. Hard upper limit of 1000 files.
Used to prevent misplaced wildcard from transferring
huge amounts of data!
--num_get_jobs n Run up to n transfer jobs at once.
Default 1. Hard upper limit of 10 jobs.
--remote_se se_name{/se_dir}
Only copy files from selected SE {and within selected /se_dir}
e.g. --remote_se ral_t1-castor-test_d0t1/gnumi/v19/fluka05_le010z185i
Only look in SE ral_t1-castor-test_d0t1 within directory gnumi/v19/fluka05_le010z185i
e.g. --remote_se 'ral_t1-dcache-disk/gnumi/v19/fluka05_le010z185i/job1.*'
Only look in SE ral_t1-dcache-disk within directory sub-tree gnumi/v19/fluka05_le010z185i/job1.*
Note: . - any single char; .* - any char string
--test Determine what files have to be transferred and from where but
don't transfer files
file-query
Either: File name
e.g. F00030574_0002.mdaq.root
or an 'egrep' wildcard regular expression: 'F000256.*.cand.R1.14.root'
Note: . - any single char; .* - any char string
CAUTION: Once match found in any SE DCM quits searching.
Or: A database query for SAM enclosed in square brackets
e.g. [ file_name like N00008695_002%.cosmic.sntp.R1_18.0.root ]
e.g. [ "run_type physics%
and data_tier sntp-near
and physical_datastream_name spill%
and start_time < to_date('2006-02-18','yyyy-mm-dd')
and end_time > to_date('2006-02-17','yyyy-mm-dd')
and version cedar" ]
Make sure there is a space after the leading '[' or the shell
command parser may treat it as a wildcard construction.
Enclose in double quotes if query includes parentheses.
Or: A DCM URL e.g. dcm://fnal-dcache-enstore/pnfs/fs/usr/minos/rec ... .snts.R1_18.0.root#129234
All 3 types of command arg may be mixed in the same invocation.
DCM first executes all SAM commands to resolve them into file names.
Then, for file names that are not already DCM URLs, it searches the
SE catalogues and converts them to DCM URLs. It then transfers any
that it locates that are not already on local disk.
Note that the 2 stage approach allows users to have a dataset defined by a SAM query and yet retrieve files from the closest SE.
Note that, for a given file-query, DCM stops searching SE catalogues as soon as it finds any match. The logic is that a dataset should always be defined by applying a search to a single SE and not by the logical OR of all SEs. So if you want to copy some data set, say a group of files matching a wildcard, and some are already on the local disk, then, by default, DCM will find only those and not copy the rest. The solution is to use the --remote_se option to force DCM to look at the SE which has the full set; it will still check the local disk so there is no risk that it will copy files it already has.
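The effect of the first-match rule can be seen in a small simulation. The Python sketch below is not DCM code. The catalogues are plain in-memory lists, the two file names are made up, and the function find_files is hypothetical. It also assumes that a wildcard must match the whole file name.

```python
import re
from typing import Dict, List, Optional, Tuple


def find_files(query: str,
               catalogues: Dict[str, List[str]],
               remote_se: Optional[str] = None) -> Tuple[Optional[str], List[str]]:
    """Return (SE name, matches) for the first SE, in search order, with any match."""
    pattern = re.compile(query)
    for se_name, files in catalogues.items():   # dicts keep insertion order
        if remote_se is not None and se_name != remote_se:
            continue
        matches = [f for f in files if pattern.fullmatch(f)]
        if matches:
            return se_name, matches             # stop at the first SE that matches
    return None, []


# Made-up catalogues, listed in search order (local disk first).
catalogues = {
    "ral_t1_ui-nfs": ["F00025600_0000.cand.R1.14.root"],
    "fnal-dcache-enstore": ["F00025600_0000.cand.R1.14.root",
                            "F00025601_0000.cand.R1.14.root"],
}
query = r"F000256.*.cand.R1.14.root"

# Default search: only the file already on local disk is found.
print(find_files(query, catalogues))

# Restricting the search to the SE holding the full set finds both files;
# anything already on local disk is then skipped rather than copied again.
se, full_set = find_files(query, catalogues, remote_se="fnal-dcache-enstore")
to_copy = [f for f in full_set if f not in catalogues["ral_t1_ui-nfs"]]
print(se, to_copy)
```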
If using the --file_list option be sure that the name of the file you pass is unique. The normal way to do that is to include the process ID (shell variable $$) in the file name. Otherwise, on a system where multiple jobs are all getting files via DCM, there is a danger that two might use the same name to return their file list. As an additional precaution, DCM will reject the command if it is passed a pre-existing file.
The --accept_dcm_url option can be useful to see what files would satisfy a request without doing any transfer. Using the --test option only shows you what files would have to be transferred, unlike the URL request which will show files on local disks as well. It also allows you to see if transfers would have to take place. The resultant URLs can later be passed to DCM for transfer, so long as they are still valid. This might be useful if running a job on a Worker Node if no catalogue were available.
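A script that wraps the get command might assemble its arguments along the following lines. This Python sketch only builds the argument list and does not run DCM. The helper name build_get_command and the file name pattern dcm_files_<pid>.txt are made up. The option names and the upper limits on --max_files and --num_get_jobs are those listed above, and the caller is assumed to expand $MINOS_TOOLS before running the command.

```python
import os
import tempfile
from typing import List, Optional


def build_get_command(file_queries: List[str],
                      max_files: int = 10,
                      num_get_jobs: int = 1,
                      file_list_dir: Optional[str] = None) -> List[str]:
    """Build (but do not run) a 'get' command line with a per-process file list."""
    if max_files > 1000:
        raise ValueError("--max_files has a hard upper limit of 1000")
    if num_get_jobs > 10:
        raise ValueError("--num_get_jobs has a hard upper limit of 10")
    directory = file_list_dir or tempfile.gettempdir()
    # Include the process ID so that concurrent jobs do not share a file list.
    file_list = os.path.join(directory, f"dcm_files_{os.getpid()}.txt")
    if os.path.exists(file_list):
        raise FileExistsError(f"{file_list} already exists; DCM would reject it")
    return ["$MINOS_TOOLS/dcm.sh", "get",
            "--max_files", str(max_files),
            "--num_get_jobs", str(num_get_jobs),
            "--file_list", file_list,
            *file_queries]


if __name__ == "__main__":
    print(build_get_command(["F00030574_0002.mdaq.root"]))
```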
put {command options} file_name file_name ...
Transfer one or more files to an SE (Storage Element).
--create_remote_dir
If necessary create remote directory
--file_list f If command succeeds, record list of files transferred
Each line of file is:-
Either: Name of file successfully written
Or: Error message starting with the character '?'
Caution: On input f must not exist.
--local_dir d Copy files from specified directory. Default: current directory
--overwrite Overwrite existing file. Default don't overwrite
--remote_se se_name/se_dir
Directory on SE. Compulsory
--test Just test, don't transfer files
file_name File name relative to --local_dir.
No wild-cards permitted and no check that file is recognisable as a data file.
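The file written by --file_list mixes successes and failures, so a caller has to separate the two. The following Python sketch shows one way to do that, assuming only what is stated above: each line is either a file name or an error message starting with '?'. The function name and the sample lines, including the error text, are made up.

```python
from typing import Iterable, List, Tuple


def read_put_file_list(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate successfully written file names from '?' error messages."""
    written: List[str] = []
    errors: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith("?"):
            errors.append(line[1:].strip())
        else:
            written.append(line)
    return written, errors


if __name__ == "__main__":
    sample = [
        "F00030574_0002.mdaq.root\n",
        "? made-up error message for illustration\n",
    ]
    print(read_put_file_list(sample))
```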
survey {<se>...<se>...}
Example: survey ral_t1-castor-test_d0t1 fnal-dcache-enstore
This command rebuilds the catalogues for the selected SEs or from all available SEs if none is specified. The resulting catalogue is stored in
dcm_catalogue/DCM/<SE name>.cat
For most SEs the scan is carried out using the appropriate commands for the SE concerned, but there are two special cases:-
DCM treats the local set of disks as the best SE and gives it the name
<site-name>-nfs
DCM does a recursive scan of all the directories that it manages and stores the result in the catalogue
dcm_catalogue/DCM/<site-name>-nfs.cat
This catalogue is used as the basis for the following commands:-
catalogue (when given the --all option)
directory_ownership
disk_usage
Once the scan is complete the survey command then executes:-
catalogue --all
disk_usage
DCM doesn't scan Enstore. Instead it copies the latest version of such a scan:-
http://www-stken.fnal.gov/enstore/tape_inventory/$FNAL::COMPLETE_FILE_LISTING_minos
and then converts it into a DCM catalogue.
test <sub-command> <arg> ...
This command is used to test and debug DCM. Typing the test command without further arguments will list what tests are currently available.
uncatalogue <file>{...<file>...}
Example: uncatalogue /stage/minos-data1/d4/C00080277_0000.mdaq.root
This removes the file both from the disk based catalogue and, as a soft link to the file, from the soft links catalogue:-
dcm_catalogue/
directory which must be the top level directory on the first disk
managed by DCM.
config/ subdirectory.
This file identifies all the SEs used by the experiment, the services each provides and the way to access these services.
This file specifies which of the experiment's SEs can be accessed from the local site and which interfaces to use to access them.
This file specifies the local disk setup at the site.
For example, for the server "ral_t1-castor-prod_d0t1;rfio"
/castor/ads.rl.ac.uk/prod/grid/hep/disk0tape1/minos;
export STAGE_SVCCLASS=minosDisk0Tape1
export STAGE_HOST=castorstager.ads.rl.ac.uk
export RFIO_USE_CASTOR_V2=YES
Handling of the DCM URL (which encodes the SE name, SE directory and file size) is done by sei_dcm_url_pack.
SE directory creation is done by sei_prepare_directory and file overwriting is done by sei_prepare_file.
Catalogue handling is provided by sei_survey, which can scan an SE and build a text catalogue. Searching such a catalogue for a file name, and hence inferring the DCM URL, is done by sei_search_catalogue.
/pnfs/fs/usr/minos
/pnfs/minos
/pnfs/fnal.gov/usr/minos
in such cases sei_search_catalogue takes
/pnfs/minos
over other forms.
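The preference for /pnfs/minos could be expressed roughly as follows. This Python sketch is only a guess at the selection rule and is not taken from sei_search_catalogue. It assumes that the candidates are alternative directory paths for the same file, and the function name pick_preferred is made up.

```python
from typing import Sequence

PREFERRED_PREFIX = "/pnfs/minos"


def pick_preferred(candidates: Sequence[str]) -> str:
    """Choose the /pnfs/minos form if present, else the first candidate."""
    if not candidates:
        raise ValueError("no candidate paths given")
    for path in candidates:
        if path == PREFERRED_PREFIX or path.startswith(PREFERRED_PREFIX + "/"):
            return path
    return candidates[0]


if __name__ == "__main__":
    print(pick_preferred([
        "/pnfs/fs/usr/minos/reco_near",
        "/pnfs/minos/reco_near",
        "/pnfs/fnal.gov/usr/minos/reco_near",
    ]))
```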
Having the transfer as a separate script allows FRS, when transferring multiple files, to run multiple jobs in parallel.
After a successful transfer FRS updates the SEI catalogues.
dcm/minos
dcm/sno
apart from
init_minos.pm
init_sno.pm
After parsing any global switches DCM knows which experiment it is dealing with and then executes the appropriate experiment initialisation.
Calls from the generic to the experiment specific code constitute the experiment API.
Parameters:-
==========
$file_name Name of file to be identified (can contain directory)
Return:-
======
$data_name MINOS: Currently this is returned as the component
between the sub-run and the data type
SNO: The module name e.g. Reconstruct
$data_type MINOS: Data type i.e. the extension e.g. mdaq.root
SNO: Data type e.g. sno_root
$detector MINOS: The detector. One of "CalDet", "Far" or "Near".
SNO: The phase e.g. salt
$run_no Run number
$sub_run_no Sub-run number (or -1 if n/a)
$version MINOS: Release (or "" if n/a)
SNO: Pass number (or "" if n/a)
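As a rough illustration of the MINOS case, the Python sketch below recovers the detector, run number and sub-run number from a file name. It is not the experiment API itself. It assumes that names follow the pattern of the examples in this document (a detector letter, an 8-digit run number, an optional 4-digit sub-run, then the rest), and that the letters C, F and N stand for CalDet, Far and Near. The remainder, which carries the data name, data type and version, is returned unparsed. The names identify_file and FileId are made up.

```python
import os
import re
from typing import NamedTuple

# Assumed mapping from the leading letter to the detector name.
DETECTORS = {"C": "CalDet", "F": "Far", "N": "Near"}

# Assumed layout: <letter><8-digit run>[_<4-digit sub-run>].<rest>
NAME_RE = re.compile(
    r"^(?P<det>[CFN])(?P<run>\d{8})(?:_(?P<sub>\d{4}))?\.(?P<rest>.+)$"
)


class FileId(NamedTuple):
    detector: str
    run_no: int
    sub_run_no: int     # -1 if the name has no sub-run
    remainder: str      # data name, data type and version, left unparsed


def identify_file(file_name: str) -> FileId:
    """Identify a MINOS-style file name, which may include a directory."""
    match = NAME_RE.match(os.path.basename(file_name))
    if match is None:
        raise ValueError(f"unrecognised file name: {file_name!r}")
    sub = match.group("sub")
    return FileId(
        DETECTORS[match.group("det")],
        int(match.group("run")),
        int(sub) if sub is not None else -1,
        match.group("rest"),
    )


if __name__ == "__main__":
    print(identify_file("/stage/minos-data1/d4/C00080277_0000.mdaq.root"))
```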
Parameters:-
==========
Either: $file_name File name whose access info is required.
Or: $db_query A database query for SAM (MINOS) or Ral (SNO)
Return:-
======
A list of file_access_size variables, each consisting of:-
$file_name:$access_info:$estimated_file_size
where:-
$file_name File name.
$access_info MINOS: ENSTORE directory
SNO: Tape name:file number
$estimated_file_size Estimated size in GB
In the case of an error a single entry is returned: "? Error message"
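Because the SNO access info itself contains a colon (tape name:file number), an entry cannot simply be split on every ':'. The Python sketch below takes the file name from the first field and the size from the last, leaving whatever lies between as the access info. It is an illustration only. The function name and both sample entries are made up, and it assumes that the file name itself contains no colon.

```python
from typing import Tuple


def parse_file_access_size(entry: str) -> Tuple[str, str, float]:
    """Split '$file_name:$access_info:$estimated_file_size' into its parts."""
    if entry.startswith("?"):
        # Error case: a single entry of the form "? Error message".
        raise ValueError(entry[1:].strip())
    file_name, first_sep, rest = entry.partition(":")
    access_info, last_sep, size = rest.rpartition(":")
    if not first_sep or not last_sep:
        raise ValueError(f"malformed entry: {entry!r}")
    return file_name, access_info, float(size)   # size is in GB


if __name__ == "__main__":
    # Made-up MINOS-style entry: access info is an Enstore directory.
    print(parse_file_access_size(
        "N00007148_0008.spill.snts.R1_18.0.root:/pnfs/minos/reco_near:0.13"))
    # Made-up SNO-style entry: access info is "tape name:file number".
    print(parse_file_access_size("some_file:TAPE01:17:1.5"))
```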