HOWTO: Start using the grid
A quickstart tutorial
by Silvia D. Olabarriaga (silvia@science.uva.nl)
edited by Dennis H. van Dok (dennisvd@nikhef.nl)
Introduction
This HOWTO document outlines the steps necessary to get access to the
grid resources offered by the VL-e Proof-Of-Concept environment.
Although it is particularly tailored for vlemed, it is generally
applicable for all wannabe grid users.
This is a step-by-step guide to get pole position on the grid. You
can follow this guide blindly, which will be the fastest way to get
started, but if you get confused and/or need more details, you can
always select "Show more info" to get in-depth background
information, explanations, and references to relevant websites.
See? This text just became visible because you selected the
above link. If you didn't see the link and didn't do anything,
that's ok too—it just means that javascript support is
disabled and all the extra information is visible by default.
If javascript works, the link functions as a toggle. The
purpose of all this is merely to keep the view from becoming
cluttered by a lot of complex and potentially confusing
information. So maybe you'd be better off without reading all
the extras, in case you become even more befuddled!
See also the PoC website, under Infrastructure:
- Authentication
- Central facilities R0
NOTE: some of the steps can take a few days, since they
require manual processing.
The steps outlined below should be followed in linear order,
except steps 2a, 2b and 2c which can be done in parallel.
- Step 1: Obtain a grid certificate
- Step 2a: Get an SRB account
- Step 2b: Register with the VleMed organisation
- Step 2c: Get an account at the SARA User Interface
- Step 3: Configure account at the UI
- Step 4: Run your first job on the grid.
A grid certificate is a “passport”
for using grid resources.
Are you sure you want to be reading this paragraph? I explained
in the previous “extra info” block that it may do
more harm than good. So unless you absolutely, definitely,
really need to know everything about grid certificates you
should probably skip this section.
A grid certificate is a personal electronic document that
attests to your true identity, much like a passport.
Let's see how they compare.
-
A passport is created for you and signed by the government. A
grid certificate is created by yourself, but signed by the
Certificate Authority (CA).
-
A passport has your photograph in it so people can check that
you are the passport holder. A grid certificate has a
private counterpart that is a cryptographic match for the
public information.
-
A passport is a single document. A certificate consists of
two parts:
- a private part, that you should keep private at all times,
and
- a public part, that anyone can see and which has
your name written on it.
-
You keep your passport safe at all times; the same goes for
the private part of a certificate.
-
It is hard to counterfeit a passport. It is easy to create a
false certificate, but very hard to falsify the CA signature.
-
A passport has a limited validity of several years. A grid
certificate has no such limit, but the CA signature has a
limited validity of about one year.
Before the CA can sign your certificate, your true identity
has to be checked. That means that you will have to
-
fill out a paper form with your personal information
-
personally meet with a Registration Authority (RA) subsidiary
to the CA and show a copy of your passport or driver's
license; the RA will sign the form
-
mail or fax the form and the copy of your ID to the CA.
Identity checking and signing is done with mathematical sorcery
called public
key cryptography. The gist of it is that someone holding the
public key of your certificate can challenge you to prove your
identity by asking a question that only the holder of the
private key is able to answer. To prevent identity
theft, you should never hand over your private key to anyone,
not even to the CA.
Digital signing is the other way around: your certificate is
signed with the CA's private key. The CA's public key can be
used to ‘decrypt’ the signature to verify that
-
the signature matches your certificate and
-
the signature was made by the CA.
Since the CA's certificate is publicly available, anyone can
check the validity of your certificate.
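The signing and verification described above can be tried out with the openssl command-line tool. What follows is only a toy sketch under stated assumptions: openssl is installed, the key pair is a throwaway one generated on the spot (never use your grid key for experiments), the function name demo_sign_verify is made up for this illustration, and a plain text message is signed instead of a real certificate, which simplifies what a CA actually does.

```sh
# Toy demonstration of digital signing with a throwaway key pair.
# Assumption: the openssl command-line tool is installed.
# demo_sign_verify is a hypothetical helper name, not a grid tool.
demo_sign_verify() {
    dir=$(mktemp -d) || return 1

    # The signer creates a private key and publishes the public half.
    openssl genrsa -out "$dir/private.pem" 2048 2>/dev/null || return 1
    openssl rsa -in "$dir/private.pem" -pubout \
        -out "$dir/public.pem" 2>/dev/null || return 1

    # Sign a message with the private key.
    echo "CN=Some User" > "$dir/message.txt"
    openssl dgst -sha256 -sign "$dir/private.pem" \
        -out "$dir/message.sig" "$dir/message.txt" || return 1

    # Anyone holding the public key can check the signature.
    openssl dgst -sha256 -verify "$dir/public.pem" \
        -signature "$dir/message.sig" "$dir/message.txt" >/dev/null || return 1

    # An altered message must be rejected with the same signature.
    echo "CN=Somebody Else" > "$dir/message.txt"
    if openssl dgst -sha256 -verify "$dir/public.pem" \
        -signature "$dir/message.sig" "$dir/message.txt" >/dev/null 2>&1; then
        rm -rf "$dir"
        return 1
    fi

    rm -rf "$dir"
    echo "signature accepted for the original, rejected for the altered message"
}
```

Running demo_sign_verify prints one summary line when both checks behave as described. The point to take away is that only the private key can produce the signature, while the public key is enough to check it.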
Get a grid certificate from the DutchGrid certificate
authority, by going to http://www.dutchgrid.nl/ca/request/
and filling out the web form:
- Ask for a "Users (personal certificates)" certificate;
- Choose
Organization: "vu", Unit: "vumc"
or
Organisation: "uva", Unit: "amc"
from the select boxes.
-
Select Certification level: "Medium security".
You will be asked to follow a sequence of steps including
-
downloading and running a script on your machine.
This script automates the creation of your key
and the certificate request to send to the CA.
-
filling in a paper form which you then need to have
signed by the indicated person.
Note that the CA will not sign the certificate until
this paper form is received. The person that should sign
the form is the Registration Authority (RA) subsidiary
to the CA.
-
choosing a "pass phrase" (or password).
Remember this well, since this is the password that
you'll have to type every time you access
grid resources (for example, to access
data or run jobs).
IMPORTANT: choose a strong passphrase!
Strong passphrases consist of a combination of letters,
digits and other symbols and are at least 12 characters
long. Avoid using common words. Since spaces are
allowed, some example passphrases are:
!Doct0r Jone$$ likes 2 cut ### patients
X-raying 34% of the 20+ PEOPLE? (...)
but I'm sure you can think of something better. Just
mind that you should be able to memorize it, because
writing it down on a sticky note makes the whole thing
pointless.
IMPORTANT: your private key will be stored in the file
"userkey.pem". There are three golden rules:
-
DON'T lose this file, or your certificate becomes worthless.
-
DON'T forget your passphrase, for the same reason.
-
NEVER give this file to anyone, or make it readable for
anyone. You risk identity theft.
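The passphrase rules given above (at least 12 characters, with letters, digits and other symbols) can be checked mechanically. The following is a minimal sketch, not part of the CA's script; check_passphrase is a hypothetical helper name, and it counts a space as an "other symbol", which is an assumption. It cannot judge whether a phrase is made of common words.

```sh
# Hypothetical helper: succeeds (exit status 0) only if the passphrase
# follows the rules of thumb from this HOWTO.
check_passphrase() {   # usage: check_passphrase 'candidate passphrase'
    p=$1
    # At least 12 characters long.
    [ "${#p}" -ge 12 ] || return 1
    # Must contain at least one letter ...
    case $p in *[A-Za-z]*) ;; *) return 1 ;; esac
    # ... at least one digit ...
    case $p in *[0-9]*) ;; *) return 1 ;; esac
    # ... and at least one character that is neither (spaces count).
    case $p in *[!A-Za-z0-9]*) ;; *) return 1 ;; esac
    return 0
}
```

Only try this with made-up examples: a real passphrase typed on a command line may end up in your shell history.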
After your request has been processed, you'll have to "install" your
certificate on all computers that you use to access the grid. This
means that a directory called ".globus" will have to be copied into
your home directory on all the computers you'll be using to access grid
resources.
This directory contains essentially 2 files:
- userkey.pem: the private key generated by the script above
- usercert.pem: the file received by e-mail when your certificate is
  approved by the Certification Authority (CA).
Important: the access permissions of these files matter, and they are sometimes changed when the files are transferred with ftp.
They should look like this:
$ ls -la ~/.globus/user*
-rw-r--r-- 1 silvia silvia 6659 Feb 15 11:53 usercert.pem
-r-------- 1 silvia silvia 963 Feb 13 14:37 userkey.pem
(which means that no one can read the private key file other than the owner.)
It is best to use zip or tar to copy the complete ".globus" directory to the computers you'll use.
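Copying the directory with tar and putting the permissions back could look roughly like this. It is a sketch under assumptions: the function names pack_globus and unpack_globus and the archive name are made up, and the chmod modes are chosen to reproduce the listing shown above (readable certificate, key readable by the owner only).

```sh
# Hypothetical helpers for moving the .globus directory between machines.

# Pack the .globus directory found under HOME_DIR into ARCHIVE.
pack_globus() {     # usage: pack_globus HOME_DIR ARCHIVE
    tar -cf "$2" -C "$1" .globus
}

# Unpack ARCHIVE under HOME_DIR and restore the expected permissions:
#   usercert.pem  -rw-r--r--  (644)
#   userkey.pem   -r--------  (400)
unpack_globus() {   # usage: unpack_globus ARCHIVE HOME_DIR
    tar -xpf "$1" -C "$2" &&
    chmod 644 "$2/.globus/usercert.pem" &&
    chmod 400 "$2/.globus/userkey.pem"
}
```

For example, pack_globus "$HOME" globus.tar on the desktop and unpack_globus globus.tar "$HOME" on the target machine. The archive contains your private key, so move it over a secure channel and delete it afterwards.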
To get an SRB account,
send mail to grid.support@sara.nl.
- indicate that you belong to VL-e Medical
- add in the e-mail your Distinguished Name (DN)
The Distinguished Name is the certificate's "Subject:". It is the
unique identifier by which you are known on the grid. For example:
O=dutchgrid, O=users, O=nikhef, CN=Dennis van Dok
The request will be processed manually and confirmed via e-mail.
You'll get an e-mail containing your user name and a password.
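If you are unsure what your Distinguished Name is, it can be read from the public part of your certificate. A minimal sketch, assuming the openssl command-line tool is available and the certificate is in the default location from step 1; show_dn is a hypothetical helper name. Depending on the openssl version, the subject is printed with slashes or with commas as separators, so the layout may differ from the example above.

```sh
# Hypothetical helper: print the "Subject:" (the DN) of a certificate.
# Defaults to the certificate in ~/.globus when no file is given.
show_dn() {   # usage: show_dn [CERTIFICATE_FILE]
    openssl x509 -in "${1:-$HOME/.globus/usercert.pem}" -noout -subject
}
```

Only the public file usercert.pem is read, so no passphrase is needed. Where the Globus tools are installed, grid-cert-info -subject should give the same information.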
Registering is necessary to associate your certificate with one "virtual
organization" (= group of people that have access to shared grid
resources).
To register, follow the instructions on
http://register.matrix.sara.nl/.
Indicate VO (virtual organization) = VleMedical.
You'll get back an e-mail confirming your registration to the grid and VO.
You may think that this step is superfluous after having gone
through all the trouble getting a grid certificate. But you should
realise that while a certificate helps to establish your identity,
it does not give you the rights to use any resources. Those rights
are usually handed down through virtual organisations, and that is
why you need to register your affiliation.
As part of this step you need to load your certificate into your browser.
Here is another
page explaining how.
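Browsers generally import a personal certificate as a single PKCS#12 file rather than as the two PEM files in ".globus". The conversion can be sketched as follows, assuming the openssl command-line tool is installed; make_p12 and the output file name are hypothetical, and the instructions of the registration site take precedence if they differ.

```sh
# Hypothetical helper: bundle usercert.pem and userkey.pem from
# GLOBUS_DIR into one PKCS#12 file that a browser can import.
# Any further arguments are passed on to openssl unchanged.
make_p12() {   # usage: make_p12 GLOBUS_DIR OUTPUT_FILE [openssl options]
    dir=$1
    out=$2
    shift 2
    openssl pkcs12 -export \
        -in "$dir/usercert.pem" \
        -inkey "$dir/userkey.pem" \
        -out "$out" "$@"
}
```

Called as make_p12 ~/.globus ~/usercert.p12, openssl asks for the passphrase of your private key and then for an export password that protects the new file. The .p12 file contains your private key, so treat it like userkey.pem and remove it once the browser import is done.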
-
ui.matrix.sara.nl is the User Interface (UI) machine used to access the
grid resources
-
from this machine, it is possible to run jobs on the clusters
available to the VleMedical Virtual Organization
-
send an e-mail to grid.support@sara.nl asking for an account at the
"UI" machine
-
You'll get back an e-mail containing your user name and a password.
You can later change the password with a command-line utility
(passwd). See also "Configuring your account at the UI" below.
- log in to the Matrix UI, ui.matrix.sara.nl
-
use ssh -Y or ssh -X
The distinction is that since OpenSSH 3.8, X forwarding with -X
has become more restricted and therefore more secure; however, some
applications cannot deal with this and crash. If you suffer from
crashing X11 applications (possibly with BadWindow error codes), use
-Y.
-
You need an X server running on your machine, with
X-tunnelling enabled.
-
You need the DISPLAY variable properly configured on your
local machine, for example:
export DISPLAY=:0
-
first change your password on the UI machine (utility passwd)
-
now install your grid certificates:
-
copy the ".globus" directory from your desktop into your home
directory (see above, at the end of step 1).
-
install SRB configuration files:
-
Get and extract the srb-userenv.tar.gz file in your
home directory. This will create a directory
.srb.
-
edit the file $HOME/.srb/.MdasEnv and replace
all occurrences of YOUR-SRB-USERNAME by your
real SRB username.
-
try out your certificate (create grid-proxy):
-
run grid-proxy-init to create a grid proxy.
You'll be asked to type the certificate passphrase you chose
in step 1.
A proxy is like a certificate, only shorter-lived. Also,
it is not signed by a CA, but by yourself with your
private key. That is why you need your
.globus directory on that machine. A proxy is
a little easier to carry around, and not so security
sensitive since it is only valid for 12 hours.
With proxies, you can let programs (i.e. grid jobs) act on
your behalf. This is necessary if, for instance, your grid
job needs to access other grid resources such as storage
elements.
You can run grid-proxy-info to see the status
of your proxy, and grid-proxy-destroy to remove
it from the system.
A proxy is just a file in the /tmp directory,
so it will remain on the system even after you log out.
-
try out your SRB configuration:
-
Create a grid proxy.
-
Run Sinit.
-
run Sls (this should show the list of files in your home
directory at the SRB).
-
You can also change the password (Spasswd).
-
this SRB session will remain open until you log out.
You can close it with Sexit.
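The configuration steps above can be double-checked with a few small shell functions. This is a sketch under stated assumptions, not an official tool: the function names are made up, the SRB username is assumed to contain no characters that are special to sed (such as "/" or "&"), and grid-proxy-info -timeleft is assumed to print the remaining proxy lifetime in seconds.

```sh
# Hypothetical helpers for checking the UI account set-up.

# Replace the YOUR-SRB-USERNAME placeholder in .srb/.MdasEnv.
set_srb_username() {   # usage: set_srb_username USERNAME [HOME_DIR]
    env_file="${2:-$HOME}/.srb/.MdasEnv"
    sed "s/YOUR-SRB-USERNAME/$1/g" "$env_file" > "$env_file.tmp" &&
    mv "$env_file.tmp" "$env_file"
}

# Report missing files and a leftover placeholder; non-zero on problems.
check_ui_setup() {     # usage: check_ui_setup [HOME_DIR]
    home_dir=${1:-$HOME}
    rc=0
    for f in .globus/usercert.pem .globus/userkey.pem .srb/.MdasEnv; do
        [ -f "$home_dir/$f" ] || { echo "missing: $f"; rc=1; }
    done
    if grep -q 'YOUR-SRB-USERNAME' "$home_dir/.srb/.MdasEnv" 2>/dev/null; then
        echo "placeholder YOUR-SRB-USERNAME still present in .srb/.MdasEnv"
        rc=1
    fi
    return $rc
}

# Create a new proxy only if the current one has less than MIN_SECONDS left.
# Assumption: "grid-proxy-info -timeleft" prints the seconds remaining.
ensure_proxy() {       # usage: ensure_proxy [MIN_SECONDS]
    min=${1:-3600}
    left=$(grid-proxy-info -timeleft 2>/dev/null) || left=0
    # Treat empty or non-numeric output (for example -1) as "no proxy".
    case $left in ''|*[!0-9]*) left=0 ;; esac
    if [ "$left" -lt "$min" ]; then
        grid-proxy-init
    fi
}
```

A typical session on the UI would then be set_srb_username followed by check_ui_setup once, and ensure_proxy before running Sinit or submitting jobs.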
This machine has all the environment necessary to access grid resources.
-
Goal: run a job that writes "Hello vlemed user!" into a file.
-
Example:
- copy /home/silvia/examples/hello.jdl into your home directory:
$ cd; cp /home/silvia/examples/hello.jdl .
- submit job to queue:
$ edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express --vo vlemed hello.jdl
- this will display the "job identifier" (jobid), which will look like this:
https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
This jobid should be used to check the job status and retrieve the generated files.
- get job status:
$ edg-job-status https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
- the possible statuses are:
Ready (ready to be queued)
Scheduled (waiting in one queue)
Running
Done
Aborted
- once the job is "Done", it is possible to obtain the generated files:
$ mkdir outputHello
$ edg-job-get-output -dir outputHello https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
The output files (std.err, std.out) will be stored in the given directory.
- to see the output:
$ more outputHello/silvia_tWoc6ZfjIwU-c0ifAWowAg/std.out
- another example:
$ cp /home/silvia/examples/getEnvironment.* .
$ edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express --vo vlemed getEnvironment.jdl
- this will dump the environment on the computing node into the "std.out" file. To be used as an illustration only.
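The submit, check and retrieve cycle shown above can be wrapped in a small script. The following is a sketch under assumptions, not a supported tool: the function names are made up, the job identifier is taken to be the first https:// address in the output of edg-job-submit, and the status is assumed to appear on a line starting with "Current Status:". Adjust the patterns if your version of the tools prints something different.

```sh
# Hypothetical helpers around the edg-job-* commands.

# Read edg-job-submit output on stdin, print the first https:// jobid.
extract_jobid() {
    sed -n 's|.*\(https://[^[:space:]]*\).*|\1|p' | head -n 1
}

# Print the state word (Ready, Scheduled, Running, Done, Aborted).
# Assumption: the status line looks like "Current Status:   Done (...)".
job_state() {   # usage: job_state JOBID
    edg-job-status "$1" |
        sed -n 's/^Current Status:[[:space:]]*\([A-Za-z]*\).*/\1/p' |
        head -n 1
}

# Submit hello.jdl to the express queue, wait, and fetch the output.
run_hello() {
    jobid=$(edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express \
                --vo vlemed hello.jdl | extract_jobid)
    [ -n "$jobid" ] || { echo "no job identifier found" >&2; return 1; }
    echo "submitted: $jobid"

    tries=0
    while [ "$tries" -lt 40 ]; do          # give up after about 20 minutes
        state=$(job_state "$jobid")
        echo "status: ${state:-unknown}"
        case $state in
            Done)    mkdir -p outputHello
                     edg-job-get-output -dir outputHello "$jobid"
                     return ;;
            Aborted) return 1 ;;
        esac
        tries=$((tries + 1))
        sleep 30
    done
    echo "gave up waiting for $jobid" >&2
    return 1
}
```

Run run_hello from the directory that contains hello.jdl, with a valid proxy in place. The polling interval and the number of attempts are arbitrary choices for this illustration.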
General instructions for running jobs
-
log in to ui.matrix.sara.nl
-
configure environment (.globus, .srb)
-
create proxy (grid-proxy-init)
-
jobs are started with command-line utilities (edg-*); see
chapter 3 of the tutorial:
http://www.dutchgrid.nl/Org/Nikhef/tutorial.pdf
-
jobs go into a queue and are actually executed on nodes of one
or more clusters that are available for the VleMedical VO
-
available queues for the VleMedical VO (only on matrix cluster, for now):
-
mu6.matrix.sara.nl:2119/jobmanager-pbs-express (for jobs
up to 10 minutes, should be used for debugging because
jobs are run immediately)
-
mu6.matrix.sara.nl:2119/jobmanager-pbs-short (for jobs up
to 4 hours)
-
mu6.matrix.sara.nl:2119/jobmanager-pbs-medium (for jobs up
to 24 hours)
Note: jobs are aborted when they exceed the queue maximum job time.
-
to see status/load of matrix cluster, look at Ganglia tools
-
to see system status (maintenance, etc)
http://www.sara.nl/systemstatus/systemstatus_eng.php3
-
Relevant commands:
- edg-job-submit --vo vlemed <job.jdl>
- edg-job-status <jobId>
- edg-job-get-output <jobId>
- edg-job-get-output -dir directory <jobId>
- edg-job-cancel <jobId>
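Choosing among the queues listed above according to the expected run time can be sketched as a small function. pick_queue is a hypothetical helper, and the limits are simply the ones quoted in this HOWTO (10 minutes, 4 hours, 24 hours) converted to minutes; they may change on the cluster.

```sh
# Hypothetical helper: print the smallest listed queue whose limit
# covers a job expected to run for the given number of minutes.
pick_queue() {   # usage: pick_queue EXPECTED_MINUTES
    ce=mu6.matrix.sara.nl:2119
    # Accept only a plain non-negative integer.
    case $1 in
        ''|*[!0-9]*) echo "usage: pick_queue MINUTES" >&2; return 2 ;;
    esac
    if   [ "$1" -le 10 ];   then echo "$ce/jobmanager-pbs-express"
    elif [ "$1" -le 240 ];  then echo "$ce/jobmanager-pbs-short"
    elif [ "$1" -le 1440 ]; then echo "$ce/jobmanager-pbs-medium"
    else
        echo "no listed queue accepts jobs of $1 minutes" >&2
        return 1
    fi
}
```

It could be used as edg-job-submit -r "$(pick_queue 90)" --vo vlemed job.jdl. Since jobs are aborted when they exceed the queue limit, it is safer to overestimate the run time.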
Advanced
my proxy
template of JDL file (for Glue=PoC, time, etc)
rubjob.tz