Simple job submission with traditional JDL files
Job submission with glite-wms-job-submit is not very straightforward: the generated job identifier has to be stored, and the job output directory then has to be located using this identifier. To save you this typing exercise, WMSX can manage the output retrieval directory for simple jobs.
You can find a prepared example for this task here: simplejob.tar.gz. To submit the example job, type the following commands.
Unpack the tarball:
> tar -xzf simplejob.tar.gz
> rm -f simplejob.tar.gz
> cd simplejob
Log onto the Grid:
> glite-voms-proxy-init -voms your_vo
# If your job is expected to be long, also get long term authentication:
> myproxy-init -d -n
Submit the job:
> wmsx-provider.sh workdir
> wmsx-requestor.sh -vo your_vo
# If your job is expected to be long, also make the WMSX remember your grid password:
> wmsx-requestor.sh -remembergrid
# If your job is expected to be long and you are working on AFS, also make the WMSX remember your AFS password:
> wmsx-requestor.sh -rememberafs
> wmsx-requestor.sh -j simplejob.jdl -r workdir/out/resultdir
# Or, if JobType = "Interactive" was set in the JDL file, say instead:
> wmsx-requestor.sh -j simplejob.jdl -r workdir/out/resultdir -o workdir/out/StdOutFile
Your results will be retrieved into the directory workdir/out/resultdir. If your job is interactive (JobType = "Interactive" is set in the JDL file), the file containing StdOut / StdErr is workdir/out/StdOutFile, which is updated on the fly. Remark: if JobType = "Interactive" is set, the fields StdInput, StdOutput and StdError cannot be set in the JDL file.
When you are finished, you can stop WMSX and destroy your Grid credentials:
> wmsx-requestor.sh -k
# If myproxy-init was used, i.e. for long term jobs:
> myproxy-destroy -d
> glite-voms-proxy-destroy
An instructive example of simple interactive job submission can be found here: interactive.tar.gz. It works in the same way as the simplejob.tar.gz example.
Mass submission of independent jobs: parameter scan with short calculations
WMSX is a convenient tool for mass submission of short, independent jobs, e.g. jobs that run for at most about 1 day. A possible application is a parameter scan study with relatively short calculations. (The job lifetime is limited by the maximal job lifetime allowed by the Grid sites. A limit of 3 days of running time is typical, so a job of at most 1 day is fine.) WMSX automatically manages output retrieval and limits the number of concurrently running jobs. For this purpose, the jobs are described in a slightly extended JDL language, which is preprocessed by WMSX. If JobType = "Interactive" is set, the file containing StdOut / StdErr is updated on the fly, so you can see what your job is actually doing. A pre-execution script helps to prepare the inputs, and a post-execution script helps to process the outputs of the job.
You can find a prepared example for this task here: sample.tar.gz. To submit the example job, type the following commands.
Unpack the tarball:
> tar -xzf sample.tar.gz
> rm -f sample.tar.gz
> cd sample
Log onto the Grid:
> glite-voms-proxy-init -voms your_vo
# If your jobs are expected to be long, also get long term authentication:
> myproxy-init -d -n
Submit the jobs:
> wmsx-provider.sh workdir
> wmsx-requestor.sh -vo your_vo
# If your jobs are expected to be long, also make the WMSX remember your grid password:
> wmsx-requestor.sh -remembergrid
# If your jobs are expected to be long and you are working on AFS, also make the WMSX remember your AFS password:
> wmsx-requestor.sh -rememberafs
# If you want to submit a large number of jobs, limit the number of concurrently running jobs, e.g. to 100:
> wmsx-requestor.sh -n 100
> wmsx-requestor.sh -a arg.list
The file arg.list contains a list of the jobs to be started, together with their parameters. Your results will be retrieved under the directory workdir/out. If your job is interactive (JobType = "Interactive" is set in the JDL file), the file containing StdOut / StdErr is updated on the fly.
When you are finished, you can stop WMSX and destroy your Grid credentials:
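For a parameter scan, arg.list is usually easier to generate than to write by hand. The sketch below is only an illustration: the function name make_arg_list and the per-line layout (a job name followed by one parameter value) are assumptions, because the exact syntax that WMSX expects is not described on this page. Take the real layout from the arg.list shipped in sample.tar.gz.

```shell
#!/bin/sh
# Hypothetical helper: print one job line per parameter value.
# ASSUMPTION: the "<job> <value>" layout is a placeholder, not the
# documented WMSX arg.list syntax; adapt it to the sample's arg.list.
make_arg_list() {
    job="$1"          # placeholder job name
    shift             # the remaining arguments are the parameter values
    for value in "$@"; do
        printf '%s %s\n' "$job" "$value"
    done
}

# Example: three jobs scanning a parameter over 0.1, 0.2 and 0.3.
make_arg_list myjob 0.1 0.2 0.3
```

Redirecting the output into a file (for example make_arg_list myjob 0.1 0.2 0.3 > arg.list) would give a list that can then be passed to wmsx-requestor.sh -a, once the line layout has been adapted.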
> wmsx-requestor.sh -k
# If myproxy-init was used, i.e. for long term jobs:
> myproxy-destroy -d
> glite-voms-proxy-destroy
Mass submission of chained jobs: parameter scan with long calculations
Mass submission has an important feature: each job can decide whether a further job needs to be launched. This decision is made by the post-execution script: if it returns with exit code 1, the chain script is invoked. The lines of the standard output of the chain script are interpreted by WMSX as lines of the arg.list file (see the previous example), so new jobs are launched. In this way, jobs can be chained, and the chains can also fork, as a job can launch multiple jobs. This makes it possible to perform a parameter study with long-term calculations, where the long-term calculations are performed by the job chains. A good example is the numerical solution of partial differential equations: multiple chains are launched (possibly with different initial conditions), and each chain imitates a long-term job (the solution of the partial differential equation with a given initial condition), split up into a sequence of shorter jobs.
You can find a prepared example for this task here: chainsample.tar.gz. To submit the example jobs, type the following commands (they are similar to those of the previous example).
Unpack the tarball:
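The chaining contract described above can be sketched as two small shell functions. This is a minimal illustration, not the scripts from chainsample.tar.gz: the names post_exec_status and chain_lines, the step counter, and the "<job> <step>" line layout are all assumptions made for the example. Only two rules are taken from the text: exit code 1 from the post-execution script requests a continuation, and every line printed by the chain script is treated as an arg.list line.

```shell
#!/bin/sh
# Hypothetical post-execution logic: return 1 (continue the chain) while
# the current step is below the last step, and 0 (stop) otherwise.
post_exec_status() {
    step="$1"
    last_step="$2"
    if [ "$step" -lt "$last_step" ]; then
        return 1    # WMSX would now invoke the chain script
    fi
    return 0        # the chain ends here
}

# Hypothetical chain logic: print one line describing the next step.
# ASSUMPTION: "<job> <step>" is a placeholder layout, not WMSX syntax.
chain_lines() {
    job="$1"
    step="$2"
    printf '%s %s\n' "$job" "$((step + 1))"
}

# Example: step 2 of 5 asks for a continuation and emits the step-3 line.
if post_exec_status 2 5; then
    echo "chain finished"
else
    chain_lines pde_run 2
fi
```

Printing several lines instead of one in chain_lines would correspond to a forking chain, since each printed line is interpreted as a separate job.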
> tar -xzf chainsample.tar.gz
> rm -f chainsample.tar.gz
> cd chainsample
Log onto the Grid:
> glite-voms-proxy-init -voms your_vo
# If your jobs are expected to be long, also get long term authentication:
> myproxy-init -d -n
Submit the jobs:
> wmsx-provider.sh workdir
> wmsx-requestor.sh -vo your_vo
# If your jobs are expected to be long, also make the WMSX remember your grid password:
> wmsx-requestor.sh -remembergrid
# If your jobs are expected to be long and you are working on AFS, also make the WMSX remember your AFS password:
> wmsx-requestor.sh -rememberafs
# If you want to submit a large number of jobs, limit the number of concurrently running jobs, e.g. to 100:
> wmsx-requestor.sh -n 100
> wmsx-requestor.sh -a arg.list
The file arg.list contains a list of the jobs to be started, together with their parameters. Your results will be retrieved under the directory workdir/out. If your job is interactive (JobType = "Interactive" is set in the JDL file), the file containing StdOut / StdErr is updated on the fly.
When you are finished, you can stop WMSX and destroy your Grid credentials:
> wmsx-requestor.sh -k
# If myproxy-init was used, i.e. for long term jobs:
> myproxy-destroy -d
> glite-voms-proxy-destroy
-- AndrasLaszlo - 10 Jul 2009