HOWTO: Start using the grid
A quickstart tutorial
    
      by Silvia D. Olabarriaga (silvia@science.uva.nl)
      edited by  Dennis H. van Dok (dennisvd@nikhef.nl)
    
    
    Introduction
    
      This HOWTO document outlines the steps necessary to get access to
      the grid resources offered by the VL-e Proof-Of-Concept environment.
      Although it is particularly tailored for vlemed, it is generally
      applicable to all wannabe grid users.
      
    
    
      This is a step-by-step guide to get pole position on the grid.  You
      can follow this guide blindly, which will be the fastest way to get
      started, but if you get confused or need more details, you can
      always select "Show more info" to get in-depth background
      information, explanations, and references to relevant websites.
    
    
      
	See? This text just became visible because you selected the
	above link. If you didn't see the link and didn't do anything,
	that's ok too—it just means that javascript support is
	disabled and all the extra information is visible by default.
      
      
	If javascript works, the link functions as a toggle. The
	purpose of all this is merely to keep the view from becoming
	cluttered by a lot of complex and potentially confusing
	information. So maybe you'd be better off without reading all
	the extras, in case you become even more befuddled!
      
      See also the PoC website, under Infrastructure:
      Authentication and Central facilities R0.
     
    
      NOTE: some of the steps can take a few days, since they
      require manual processing.
    
    
      The steps outlined below should be followed in linear order,
      except steps 2a, 2b and 2c which can be done in parallel.
    
    
      - Step 1: Obtain a grid certificate
 
	- Step 2a: Get an SRB account
 
	- Step 2b: Register with the VleMed organisation
 
	- Step 2c: Get an account at the SARA User Interface
 
      - Step 3: Configure your account at the UI
 
      - Step 4: Run your first job on the grid
 
    
    
    
    Step 1: Obtain a grid certificate

      A grid certificate is a “passport” for using grid resources.
    
    
    
      Are you sure you want to be reading this paragraph? I explained
      in the previous “extra info” block that it may do
      more harm than good. So unless you absolutely, definitely,
      really need to know everything about grid certificates you
      should probably skip this section.
    
    
      A grid certificate is a personal electronic document that
      attests to your true identity, much like a passport.
      Let's see how they compare.
    
    
      - 
	A passport is created for you and signed by the government. A
	grid certificate is created by yourself, but signed by the
	Certificate Authority (CA).
 
      - 
	A passport has your photograph in it so people can check that 
	you are the passport holder. A grid certificate has a
	private counterpart that is a cryptographic match for the
	public information.
      
 
      - 
	A passport is a single document. A certificate consists of
	two parts:
	
	  - a private part, that you should keep private at all times,
	  and
 
	  - a public part, that anyone can see and which has
	  your name written on it.
 
	
       
      - 
	You keep your passport safe at all times; the same goes for
	the private part of a certificate.
      
 
      - 
	It is hard to counterfeit a passport. It is easy to create a
	false certificate, but very hard to falsify the CA signature.
      
 
      - 
	A passport has a limited validity of several years. A grid
	certificate has no such limit, but the CA signature has a
	limited validity of about one year.
      
 
    
    
      Before the CA can sign your certificate, your true identity
      has to be checked. That means that you will have to
    
    
      - 
	fill out a paper form with your personal information
      
 
      - 
	personally meet with a Registration Authority (RA) subsidiary
	to the CA and show a copy of your passport or driver's
	license; the RA will sign the form
      
 
      - 
	mail or fax the form and the copy of your ID to the CA.
      
 
    
    
      Identity checking and signing is done with mathematical sorcery
      called public key cryptography. The gist of it is that someone
      holding the
      public key of your certificate can challenge you to prove your
      identity by asking a question that only the holder of the
      private key is able to answer. To prevent identity
      theft, you should never hand over your private key to anyone,
      not even to the CA.
    
    
      Digital signing is the other way around: your certificate is
      signed with the CA's private key. The CA's public key can be
      used to ‘decrypt’ the signature to verify that
    
    
      - 
	the signature matches your certificate and
      
 
      - 
	the signature was made by the CA.
      
 
    
    
      Since the CA's certificate is publicly available, anyone can
      check the validity of your certificate.
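
As a rough illustration of the signing idea described above, the following sketch uses the general-purpose openssl command-line tool (not any grid-specific tooling) to sign a small file with a private key and then verify the signature with the matching public key. The throwaway key and file names are made up for this example, and nothing here touches your real grid certificate.

```shell
# Work in a scratch directory so no real key material is involved.
work=$(mktemp -d)

# Create a throwaway private key and derive its public counterpart.
openssl genrsa -out "$work/private.pem" 2048 2>/dev/null
openssl rsa -in "$work/private.pem" -pubout -out "$work/public.pem" 2>/dev/null

# Sign a message with the private key ...
printf 'I am the holder of this key\n' > "$work/message.txt"
openssl dgst -sha256 -sign "$work/private.pem" \
    -out "$work/message.sig" "$work/message.txt"

# ... and check the signature using only the public key.
verify_result=$(openssl dgst -sha256 -verify "$work/public.pem" \
    -signature "$work/message.sig" "$work/message.txt")
echo "$verify_result"
```

Anyone holding only public.pem can run the last command. If message.txt is changed after signing, the same command reports a failure instead, which is the property a CA signature on your certificate relies on.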
    
      
    
    
      
     
    
      Get a grid certificate from the DutchGrid certificate
      authority, by going to http://www.dutchgrid.nl/ca/request/
      and filling out the web form:
    
    
	- Ask for a "Users (personal certificates)" certificate;
 
	- Choose Organization: "vu", Unit: "vumc"
	or Organisation: "uva", Unit: "amc"
	from the select boxes;
	 
	- 
	  Select Certification level: "Medium security".
	
 
    
    
      You will be asked to follow a sequence of steps, including:
    
    
      -  
	downloading and running a script on your machine.
	
	  This script automates the creation of your key
	  and the certificate request to send to the CA. 
	
       
	  -  
	    filling in a paper form, which you then need to have
	    signed by the indicated person.
	    
	      Note that the CA will not sign the certificate until
	      it has received this paper form. The person who should
	      sign the form is the Registration Authority (RA)
	      subsidiary to the CA.
	    
	   
	  -  
	      choosing a "pass phrase" (or password). 
	      Remember it well, since this is the password that
	      you will have to type every single time you access
	      grid resources (for example, to access data or run
	      jobs).
	    
	    
IMPORTANT: choose a strong passphrase!
	    
	    
	      
	      Strong passphrases consist of a combination of letters,
	      digits and other symbols and are at least 12 characters
	      long.  Avoid using common words.  Since spaces are
	      allowed, some example passphrases are:
	      
	      
!Doct0r Jone$$ likes 2 cut ### patients
X-raying 34% of the 20+ PEOPLE? (...)
	      
	      
	      but I'm sure you can think of something better. Just
	      mind that you should be able to memorize it, because
	      writing it down on a sticky note makes the whole thing
	      pointless.
	      
	     
	   
	
	
	  IMPORTANT:  your private key will be stored in the file
	  "userkey.pem". There are three golden rules:
	  
	
	
	  - 
	    DON'T lose this file, or your certificate becomes worthless.
	  
 
	  - 
	    DON'T forget your passphrase, for the same reason.
	  
 
	  - 
	    NEVER give this file to anyone, or make it readable for
	    anyone. You risk identity theft.
	  
 
	
 After your request has been processed, you'll have to "install" your
certificate on every computer that you use to access the grid. This
means that a directory called ".globus" has to be copied into your
home directory on each computer you'll be using to access grid
resources.
This directory contains essentially 2 files: 
  - userkey.pem: the private key generated by the script above
 
  - usercert.pem: the file received by e-mail when your certificate is
    approved by the Certificate Authority (CA).
 
 
Important: the access permissions of these files matter, and they are sometimes changed when the files are transferred with ftp. 
They should look like this:
$ ls -la ~/.globus/user*
-rw-r--r--    1 silvia   silvia       6659 Feb 15 11:53 usercert.pem
-r--------    1 silvia   silvia        963 Feb 13 14:37 userkey.pem
(which means that no one other than the owner can read the private key file.)
It is best to use zip or tar to copy the complete ".globus" directory to the computers you'll use.
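
If the permissions have been changed along the way, they can be restored by hand. The sketch below shows one way to do that and to pack the directory with tar. The function names and the archive name globus.tar are made up for this example, and the directory defaults to $HOME/.globus.

```shell
# Restore the permissions shown in the listing above.
fix_globus_permissions() {
    globus_dir="${1:-$HOME/.globus}"
    chmod 644 "$globus_dir/usercert.pem"   # -rw-r--r--  public part
    chmod 400 "$globus_dir/userkey.pem"    # -r--------  private key, owner only
}

# Pack the whole directory into one archive (hypothetical default name).
pack_globus() {
    globus_dir="${1:-$HOME/.globus}"
    archive="${2:-globus.tar}"
    # tar records the permission bits of each file; -C makes the archived
    # paths relative to the parent directory, so they start with ".globus/".
    tar -cf "$archive" -C "$(dirname "$globus_dir")" "$(basename "$globus_dir")"
}

# Example use on the real directory (uncomment to run):
# fix_globus_permissions && pack_globus "$HOME/.globus" "$HOME/globus.tar"
```

The archive contains your private key, so treat it as carefully as userkey.pem itself and delete it once it has been unpacked. After unpacking on the other machine, run the ls command above again to confirm the permissions.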
 
    Step 2a: Get an SRB account

To get an SRB account, send mail to grid.support@sara.nl:
  - indicate that you belong to VL-e Medical
 
  - add in the e-mail your Distinguished Name (DN) 
  
    The Distinguished Name is the certificate's "Subject:". It is the
    unique identifier by which you are known on the grid. For example:
    
      O=dutchgrid, O=users, O=nikhef, CN=Dennis van Dok
     
   
   
  The request will be processed manually and confirmed via e-mail:
  you'll get an e-mail containing your SRB user name and a password.
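
To find the Distinguished Name to put in that e-mail, you can print the subject of your certificate. The sketch below uses the general-purpose openssl tool; the function name show_dn is made up for this example, and the exact layout of the printed subject line differs between openssl versions.

```shell
# Print the "Subject:" (Distinguished Name) of a certificate file.
# Defaults to the usercert.pem installed in step 1.
show_dn() {
    cert_file="${1:-$HOME/.globus/usercert.pem}"
    openssl x509 -in "$cert_file" -noout -subject
}

# Example use (uncomment to run against your own certificate):
# show_dn
```

Only the public usercert.pem is read here, so no passphrase is needed.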
    Step 2b: Register with the VleMed organisation

This step is necessary to associate your certificate with a "virtual
organization" (VO), a group of people that have access to shared grid
resources.
To register, follow the instructions on
http://register.matrix.sara.nl/.
Indicate VO (virtual organization) = VleMedical.
  You'll get back an e-mail confirming your registration with the grid and the VO.
  
    You may think that this step is superfluous after having gone
    through all the trouble getting a grid certificate. But you should
    realise that while a certificate helps to establish your identity,
    it does not give you the rights to use any resources. Those rights
    are usually handed down through virtual organisations, and that is
    why you need to register your affiliation.
  
  
    As part of this step you need to load your certificate into your browser.
  Here is another
  page explaining how.
 
    Step 2c: Get an account at the SARA User Interface

  - 
    ui.matrix.sara.nl is the User Interface (UI) machine used to access the
    grid resources
  
 
  - 
    from this machine, it is possible to run jobs on the clusters
    available to the VleMedical Virtual Organization
  
 
  - 
    send an e-mail to grid.support@sara.nl asking for an account at the
    "UI" machine
  
 
  - 
    you'll get back an e-mail containing your user name and a password.
    You can change the password later with a command-line utility
    (passwd); see also the step on configuring your account at the UI,
    below.
  
 
    Step 3: Configure your account at the UI

  - log in to the Matrix UI, ui.matrix.sara.nl
  
    - 
      use ssh -X or ssh -Y
      
	
	  The distinction is that since OpenSSH 3.8, plain X forwarding
	  (-X) has become more restricted and therefore more secure;
	  however, some applications cannot deal with this and crash.
	  If you suffer from crashing X11 applications (possibly with
	  BadWindow error codes), use -Y, which requests trusted
	  forwarding.
	
	
	  - 
	    You need an Xserver running on your machine, with
	    X-tunnelling enabled.
	  
 
	  - 
	    You need the DISPLAY variable properly configured on your
	    local machine, for example:
	    
	      export DISPLAY=:0
	    
	   
	
       
     
  
   
  - 
    first change your password on the UI machine (utility passwd)
  
 
  - 
    now install your grid certificates:
    
      - 
	copy the ".globus" directory from your desktop into your home
	directory on the UI (see above, at the end of step 1).
      
 
    
   
  
  - 
    install SRB configuration files: 
    
      - 
	Get and extract the srb-userenv.tar.gz file in your
	home directory. This will create a directory
	.srb.
      
 
      - 
	edit the file $HOME/.srb/.MdasEnv and replace
	all occurrences of YOUR-SRB-USERNAME by your
	real SRB username.
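
Instead of editing the file by hand, the replacement can be done with sed. This is only a sketch: the function name set_srb_username and the user name jdoe are made up for this example, and a backup copy with the suffix .bak is kept next to the original.

```shell
# Replace every occurrence of the placeholder YOUR-SRB-USERNAME in a file.
# Arguments: the file to edit, then the real SRB user name.
set_srb_username() {
    env_file="$1"
    username="$2"
    cp "$env_file" "$env_file.bak"          # keep the original as a backup
    # Note: this simple form assumes the user name contains no "/" or "&".
    sed "s/YOUR-SRB-USERNAME/$username/g" "$env_file.bak" > "$env_file"
}

# Example use (uncomment and fill in your own SRB user name):
# set_srb_username "$HOME/.srb/.MdasEnv" jdoe
```

Afterwards, `grep YOUR-SRB-USERNAME $HOME/.srb/.MdasEnv` should print nothing.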
      
 
    
   
  - 
    try out your certificate (create grid-proxy):
    
      - 
	run grid-proxy-init to create a grid proxy.
	You'll be asked to type the certificate passphrase you chose
	in step 1.
	
	  
	    A proxy is like a certificate, only shorter-lived. Also,
	    it is not signed by a CA, but by yourself with your
	    private key. That is why you need your
	    .globus directory on that machine. A proxy is
	    a little easier to carry around, and not so security
	    sensitive since it is only valid for 12 hours.
	  
	  
	    With proxies, you can let programs (i.e. grid jobs) act on
	    your behalf. This is necessary if, for instance, your grid
	    job needs to access other grid resources such as storage
	    elements.
	  
	  
	    You can run grid-proxy-info to see the status
	    of your proxy, and grid-proxy-destroy to remove
	    it from the system.
	  
	  
	    A proxy is just a file in the /tmp directory,
	    so it will remain on the system even after you log out.
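
Because a proxy expires after a limited time, it can be handy to see how much of that time is left. The sketch below turns a number of seconds into hours and minutes; it assumes that the installed grid-proxy-info accepts a -timeleft option that prints the remaining lifetime in seconds, which you should check on your UI. The function name format_timeleft is made up for this example.

```shell
# Convert a lifetime in seconds into "<hours>h <minutes>m".
format_timeleft() {
    seconds="$1"
    printf '%dh %02dm\n' $((seconds / 3600)) $((seconds % 3600 / 60))
}

# Only query the proxy when the Globus tools are installed and a proxy
# exists; -timeleft is an assumption about the installed version.
if command -v grid-proxy-info >/dev/null 2>&1; then
    left=$(grid-proxy-info -timeleft 2>/dev/null) && format_timeleft "$left"
fi
```

For a freshly created 12-hour proxy, format_timeleft 43200 prints "12h 00m".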
	  
	 
       
    
   
  - 
    try out your SRB configuration:
    
      - 
	Create a grid proxy.
      
 
      - 
	Run Sinit.
      
 
      - 
	run Sls (this should show the list of files in your home
	directory at the SRB).
      
 
      - 
	You can also change the password (Spasswd).
      
 
      - 
	this SRB session will remain open until you log out.
	You can close it with Sexit.
      
 
    
   
  
    This machine has the complete environment necessary to access grid resources.
  
  
 
    Step 4: Run your first job on the grid

  - 
    Goal: run a job that writes "Hello vlemed user!" into a file.
  
 
  -  
    Example: 
    
      -  copy /home/silvia/examples/hello.jdl into your home directory:
      $ cd; cp /home/silvia/examples/hello.jdl .
      
 -  submit the job to a queue:
      $ edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express --vo vlemed hello.jdl
      
 -  this will display the "job identifier" (jobid), which will look like this:
      https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
      Use this jobid to check the job status and to retrieve the generated files.
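
Since the jobid is needed for every later command, it helps to keep it in a shell variable rather than copying it by hand. The sketch below picks the first https:// address out of whatever text it is given; the function name extract_jobid is made up for this example, and it assumes that the jobid is the first such address in the output of edg-job-submit.

```shell
# Read text on standard input and print the first https:// address in it.
extract_jobid() {
    grep -o 'https://[^[:space:]]*' | head -n 1
}

# Example use on the UI (uncomment to run):
# jobid=$(edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express \
#             --vo vlemed hello.jdl | extract_jobid)
# echo "$jobid"
```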
      
 -  get job status:
      $ edg-job-status https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
      
 -  the possible statuses are:
      Ready (ready to be queued)
      Scheduled (waiting in a queue)
      Running
      Done
      Aborted
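
Of the statuses listed above, only Done and Aborted mean that the job will not change any more. A small helper like the following can be used in your own scripts to decide whether it is worth checking again; the function name is_final_status is made up for this example, and it only knows the five statuses named in this guide.

```shell
# Succeed (exit status 0) when the given status is final.
is_final_status() {
    case "$1" in
        Done|Aborted) return 0 ;;   # nothing more will happen to the job
        *)            return 1 ;;   # Ready, Scheduled, Running: check later
    esac
}

# Example use:
if is_final_status "Running"; then
    echo "job finished"
else
    echo "job still in progress"
fi
```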
    
 
    
     -  once the job is "Done", it is possible to obtain the generated files:
    $ mkdir outputHello
    $ edg-job-get-output -dir outputHello https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg 
    The output files (std.err, std.out) will be stored in the given directory.
    
 -  to see the output:
    $ more outputHello/silvia_tWoc6ZfjIwU-c0ifAWowAg/std.out
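
In the example above, the output ends up in a subdirectory named after the user and the last part of the jobid. Assuming that this naming pattern holds in general (it is inferred from this one example only), the path can be built as follows; the function name job_output_dir is made up for this sketch.

```shell
# Build "<base>/<user>_<last part of jobid>" from its three pieces.
job_output_dir() {
    base_dir="$1"
    user="$2"
    jobid="$3"
    # ${jobid##*/} strips everything up to and including the last "/".
    printf '%s/%s_%s\n' "$base_dir" "$user" "${jobid##*/}"
}

# Example use:
job_output_dir outputHello silvia \
    https://mu3.matrix.sara.nl:9000/tWoc6ZfjIwU-c0ifAWowAg
```

If the directory is not where you expect it, simply list the contents of outputHello to see what was actually created.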
    
 -  another example:
    $ cp /home/silvia/examples/getEnvironment.* .
    $ edg-job-submit -r mu6.matrix.sara.nl:2119/jobmanager-pbs-express --vo vlemed getEnvironment.jdl
    
 -  this job dumps the environment of the computing node into the "std.out" file. It is meant as an illustration only.
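
The contents of hello.jdl are not reproduced in this guide. As an assumption about what such a file might contain, the sketch below writes a minimal job description that echoes the greeting and returns std.out and std.err, matching the file names mentioned above. The function name write_hello_jdl and the file name hello-sketch.jdl are made up; the real file in /home/silvia/examples may well differ.

```shell
# Write a minimal, hypothetical JDL file for a "hello" job.
# Argument: the file to create (defaults to hello-sketch.jdl).
write_hello_jdl() {
    cat > "${1:-hello-sketch.jdl}" <<'EOF'
Executable    = "/bin/echo";
Arguments     = "Hello vlemed user!";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
EOF
}

# Example use (uncomment to create the file in the current directory):
# write_hello_jdl hello-sketch.jdl
```

Compare the result with the real hello.jdl before relying on it.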
  
 
  General instructions for running jobs
  
    - 
      log in to ui.matrix.sara.nl
    
 
    - 
      configure environment (.globus, .srb)
    
 
    - 
      create proxy (grid-proxy-init)
    
 
    - 
      jobs are started with command-line utilities (edg-*); see
      chapter 3 of the tutorial:
      http://www.dutchgrid.nl/Org/Nikhef/tutorial.pdf
    
 
    - 
      jobs go into a queue and are actually executed on nodes of one
      or more clusters that are available to the VleMedical VO
    
 
    - 
      available queues for the VleMedical VO (only on the matrix cluster, for now):
      
	- 
	  mu6.matrix.sara.nl:2119/jobmanager-pbs-express (for jobs
	  up to 10 minutes, should be used for debugging because
	  jobs are run immediately)
	
 
	- 
	  mu6.matrix.sara.nl:2119/jobmanager-pbs-short (for jobs up
	  to 4 hours)
	
 
	- 
	  mu6.matrix.sara.nl:2119/jobmanager-pbs-medium (for jobs up
	  to 24 hours)

	  Note: in every queue, jobs are aborted when they exceed the
	  queue maximum job time.
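
The choice between these queues depends only on how long you expect the job to run. The sketch below encodes the limits listed above (10 minutes, 4 hours, 24 hours); the function name pick_queue is made up for this example, and the limits should be checked again if the queue setup changes.

```shell
# Print the smallest queue that fits a job of the given length in minutes.
pick_queue() {
    minutes="$1"
    prefix="mu6.matrix.sara.nl:2119/jobmanager-pbs"
    if [ "$minutes" -le 10 ]; then
        echo "$prefix-express"      # up to 10 minutes
    elif [ "$minutes" -le 240 ]; then
        echo "$prefix-short"        # up to 4 hours
    elif [ "$minutes" -le 1440 ]; then
        echo "$prefix-medium"       # up to 24 hours
    else
        echo "no listed queue accepts jobs longer than 24 hours" >&2
        return 1
    fi
}

# Example use: a job expected to take about 90 minutes.
pick_queue 90
```

The printed name can be passed to edg-job-submit with the -r option, as in the examples above. Leave some margin, since a job that runs over the limit is aborted.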
	
 
      
     
    - 
      to see status/load of matrix cluster, look at Ganglia tools
      
    
 
    - 
      to see system status (maintenance, etc)
      http://www.sara.nl/systemstatus/systemstatus_eng.php3
    
 
    - 
      Relevant commands:
      
	- edg-job-submit --vo vlemed <job.jdl>
 
	- edg-job-status <jobId>
 
	- edg-job-get-output <jobId>
 
	- edg-job-get-output -dir directory <jobId>
 
	- edg-job-cancel <jobId>
	
 
      
     
  
 
Advanced
my proxy
template of JDL file (for Glue=PoC, time, etc)
rubjob.tz