{"id":42,"date":"2015-07-21T11:02:31","date_gmt":"2015-07-21T10:02:31","guid":{"rendered":"https:\/\/portal.supercomputing.wales\/?page_id=42"},"modified":"2016-01-28T17:32:24","modified_gmt":"2016-01-28T17:32:24","slug":"interactive-use-job-arrays","status":"publish","type":"page","link":"https:\/\/portal.supercomputing.wales\/index.php\/index\/slurm\/interactive-use-job-arrays\/","title":{"rendered":"Advanced Use: Interactive, X Forwarding, Job Arrays, Task Geometry, Parallel Batch Submission"},"content":{"rendered":"<h3>Interactive Use<\/h3>\n<p>In order to use an HPC system interactively &#8211; i.e. whilst sat in front of the terminal interacting live with allocated resources &#8211; there is a simple two stage process in Slurm.<\/p>\n<p>Firstly, we must create an allocation &#8211; that is, an allocation of a certain amount of resources that we specify we need. This is done using the <em>salloc<\/em> command, like this:<br \/>\n<pre class=\"preserve-code-formatting\">[test.user@cstl001 imb]$ salloc -n 8 --ntasks-per-node=1\n salloc: Granted job allocation 134<\/pre><br \/>\nNow that an allocation has been granted, we have access to those specified resources. Note that the resource specification we made in this case is exactly as the parameters passed for batch use was &#8211; so in this case we have asked for 8 tasks (processes) with them distributed at one per node.<\/p>\n<p>Now that we are &#8216;inside&#8217; an allocation, we can use the <em>srun<\/em> command to execute against the allocated resources, for example:<br \/>\n<pre class=\"preserve-code-formatting\">[test.user@cstl001 imb]$ srun hostname\ncst004\ncst003\ncst002\ncst008\ncst006\ncst007\ncst001\ncst005<\/pre><br \/>\nThe above output shows how, by default, <em>srun<\/em> executes a command on all allocated processors. Arguments can be passed to <em>srun<\/em> to operate differently, for example:<\/p>\n<p>We could also launch an MPI job here if we wished. We would load the software <em>modules<\/em> as we do in a batch script and call <em>mpirun<\/em> in the same way. 
<p>It is also possible to use <em>srun</em> to launch an interactive shell process for some heavy processing on a compute node, for example:</p>
<pre class="preserve-code-formatting">srun -n 2 --pty bash</pre>
<p>This would move us to a shell on a compute node.</p>
<h3>X11 Forwarding Interactively</h3>
<p>Once a resource allocation is granted as per the above, we can use <em>srun</em> to provide X11 graphical forwarding all the way from the compute nodes to our desktop using <em>srun --x11 &lt;application&gt;</em>.</p>
<p>For example, to run an X terminal:</p>
<pre class="preserve-code-formatting">srun --x11 xterm</pre>
<p>Note that the user must have X11 forwarded to the login node for this to work – this can be checked by running <em>xclock</em> at the command line.</p>
<p>Additionally, the <em>--x11</em> argument can be given in the form <em>--x11=[batch|first|last|all]</em> to the following effects:</p>
<ul>
<li><em>--x11=first</em> This is the default, and provides X11 forwarding from the first of the allocated compute hosts.</li>
<li><em>--x11=last</em> This provides X11 forwarding from the last of the allocated compute hosts.</li>
<li><em>--x11=all</em> This provides X11 forwarding from all allocated compute hosts, which can be quite resource heavy and is an extremely rare use-case.</li>
<li><em>--x11=batch</em> This supports use in a batch job submission, and will provide X11 forwarding from the first node allocated to a batch job. The user must leave open the X11-forwarded login node session from which they submitted the job.</li>
</ul>
<h3>Job Arrays</h3>
<h4>Submission</h4>
<p>Job arrays operate in Slurm much as they do in other batch systems. They enable a potentially huge number of similar jobs to be launched very quickly and simply, with the value of a runtime-assigned <em>array id</em> then being used to vary slightly what each particular job iteration does. Array jobs are declared using the <em>--array</em> argument to <em>sbatch</em>, which can (as with all arguments to <em>sbatch</em>) be placed inside a job script as an <em>#SBATCH</em> declaration or passed as a direct argument to <em>sbatch</em>. There are a number of ways to declare the indexes:</p>
<pre class="preserve-code-formatting">[test.user@cstl001 hello_world]$ sbatch --array=0-64 sbatch_sub.sh</pre>
<p>...declares an array with iteration indexes from 0 to 64.</p>
<pre class="preserve-code-formatting">[test.user@cstl001 hello_world]$ sbatch --array=0,4,8,12 sbatch_sub.sh</pre>
<p>...declares an array with iteration indexes specifically identified as 0, 4, 8 and 12.</p>
<pre class="preserve-code-formatting">[test.user@cstl001 hello_world]$ sbatch --array=0-12:3 sbatch_sub.sh</pre>
<p>...declares an array with iteration indexes from 0 to 12 with a stepping of 3, i.e. 0, 3, 6, 9, 12.</p>
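<p>As a hedged sketch of how the runtime-assigned index is typically consumed (the program name and the input file naming scheme are assumptions for illustration only), each iteration can use the index to select its own input:</p>
<pre class="preserve-code-formatting">#!/bin/bash

#SBATCH -J array_inputs
#SBATCH --array=0-64
#SBATCH -n 1
#SBATCH -o output.%J

# hypothetical serial program, with input files named input_0.dat ... input_64.dat
./my_program input_${SLURM_ARRAY_TASK_ID}.dat</pre>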
<h4>Monitoring</h4>
<p>When a job array is running, the output of <em>squeue</em> shows the parent task and the currently running iteration indexes:</p>
<pre class="preserve-code-formatting">[test.user@cstl001 hello_world]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
        143_[6-64]       all    hello test.use PD       0:00      4 (Resources)
             143_4       all    hello test.use  R       0:00      4 cst[005-008]
             143_5       all    hello test.use  R       0:00      4 cst[005-008]
             143_0       all    hello test.use  R       0:03      4 cst[001-004]
             143_1       all    hello test.use  R       0:03      4 cst[001-004]
[test.user@cstl001 hello_world]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
       143_[15-64]       all    hello test.use PD       0:00      4 (Resources)
            143_14       all    hello test.use  R       0:00      4 cst[001-004]
            143_10       all    hello test.use  R       0:02      4 cst[005-008]
            143_11       all    hello test.use  R       0:02      4 cst[005-008]
             143_1       all    hello test.use  R       0:07      4 cst[001-004]</pre>
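<p>A couple of related commands can help when monitoring arrays. These are standard Slurm options rather than anything specific to this system, so check <em>man squeue</em> and <em>man scancel</em> on your installation: <em>squeue -r</em> lists one line per array element instead of the condensed view above, and individual elements can be cancelled by appending the index to the parent job ID:</p>
<pre class="preserve-code-formatting"># one line per array element
[test.user@cstl001 hello_world]$ squeue -r

# cancel just element 4 of array job 143, or the whole array
[test.user@cstl001 hello_world]$ scancel 143_4
[test.user@cstl001 hello_world]$ scancel 143</pre>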
<h4>IDs and Variables</h4>
<p>Each iteration in an array assumes its own job ID in Slurm. However, Slurm also creates a number of new environment variables that can be used in the script in addition to <em>SLURM_JOB_ID</em>, which stores the particular iteration's job ID.</p>
<p><em>SLURM_ARRAY_JOB_ID</em> stores the value of the parent job submission – i.e. the ID reported in the output from <em>sbatch</em> when submitted.</p>
<p><em>SLURM_ARRAY_TASK_ID</em> stores the value of the array index.</p>
<p>Additionally, when specifying a job's STDOUT and STDERR files using the <em>-o</em> and <em>-e</em> directives to <em>sbatch</em>, the reference <em>%A</em> will take on the parent job ID and the reference <em>%a</em> will take on the iteration index. In summary:</p>
<table>
<thead>
<tr><th>BASH Environment Variable</th><th>SBATCH Field Code</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>$SLURM_JOB_ID</td><td>%J</td><td>Job identifier</td></tr>
<tr><td>$SLURM_ARRAY_JOB_ID</td><td>%A</td><td>Array parent job identifier</td></tr>
<tr><td>$SLURM_ARRAY_TASK_ID</td><td>%a</td><td>Array job iteration index</td></tr>
<tr><td>$SLURM_ARRAY_TASK_COUNT</td><td></td><td>Number of indexes (tasks) in the job array</td></tr>
<tr><td>$SLURM_ARRAY_TASK_MAX</td><td></td><td>Maximum array index</td></tr>
<tr><td>$SLURM_ARRAY_TASK_MIN</td><td></td><td>Minimum array index</td></tr>
</tbody>
</table>
<p>And so, with this example script:</p>
<pre class="preserve-code-formatting">#!/bin/bash

#SBATCH -J arraytest
#SBATCH --array=0-4
#SBATCH -o output-%A_%a-%J.o
#SBATCH -n 1

echo SLURM_JOB_ID $SLURM_JOB_ID
echo SLURM_ARRAY_JOB_ID $SLURM_ARRAY_JOB_ID
echo SLURM_ARRAY_TASK_ID $SLURM_ARRAY_TASK_ID</pre>
<p>We can submit the script:</p>
<pre class="preserve-code-formatting">[test.user@cstl001 sbatch]$ sbatch array.sh
Submitted batch job 231</pre>
<p>Resulting in the following output files:</p>
<pre class="preserve-code-formatting">output-231_0-232.o
output-231_1-233.o
output-231_2-234.o
output-231_3-235.o
output-231_4-231.o</pre>
<p>Each iteration of which contained variables as follows:</p>
<pre class="preserve-code-formatting">output-231_0-232.o:
SLURM_JOB_ID 232
SLURM_ARRAY_JOB_ID 231
SLURM_ARRAY_TASK_ID 0</pre>
<pre class="preserve-code-formatting">output-231_1-233.o:
SLURM_JOB_ID 233
SLURM_ARRAY_JOB_ID 231
SLURM_ARRAY_TASK_ID 1</pre>
<pre class="preserve-code-formatting">output-231_2-234.o:
SLURM_JOB_ID 234
SLURM_ARRAY_JOB_ID 231
SLURM_ARRAY_TASK_ID 2</pre>
<pre class="preserve-code-formatting">output-231_3-235.o:
SLURM_JOB_ID 235
SLURM_ARRAY_JOB_ID 231
SLURM_ARRAY_TASK_ID 3</pre>
<pre class="preserve-code-formatting">output-231_4-231.o:
SLURM_JOB_ID 231
SLURM_ARRAY_JOB_ID 231
SLURM_ARRAY_TASK_ID 4</pre>
<p>More advanced job array information is available in the Slurm documentation <a href="http://cf-por-00a.hpcwales.local/slurm/html/job_array.html">here</a>.</p>
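<p>One further submission option worth knowing, assuming the installed Slurm version supports it (check <em>man sbatch</em> on your system): a <em>%</em> separator in the <em>--array</em> specification throttles how many elements of the array may run simultaneously, for example:</p>
<pre class="preserve-code-formatting"># declare indexes 0-64, but allow at most 8 elements to run at once
[test.user@cstl001 hello_world]$ sbatch --array=0-64%8 sbatch_sub.sh</pre>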
<h3>Task Geometry</h3>
<p>If you need to run an MPI (/OpenMP) job that requires a custom task geometry – perhaps because one task requires a larger amount of memory than the others – then this can easily be achieved with Slurm.</p>
<p>To do this, rather than specifying the number of processors required, one can specify the number of nodes (<em>#SBATCH --nodes=X</em>) plus the number of tasks per node (<em>#SBATCH --ntasks-per-node=X</em>). The geometry can then be defined by overriding the <em>SLURM_TASKS_PER_NODE</em> environment variable at runtime. As long as there are enough nodes to match the geometry, Slurm will allocate parallel tasks to the MPI runtime following the geometry specification.</p>
<p>For example:</p>
<pre class="preserve-code-formatting">#!/bin/bash --login

#SBATCH --job-name geom_test
#SBATCH --nodes 4
#SBATCH --ntasks-per-node 16
#SBATCH --time 00:10:00
#SBATCH --output geom_test.%J.out

module purge
module load mpi/intel/5.1

export SLURM_TASKS_PER_NODE='1,16(x2),6'
mpirun ./mpi_test</pre>
<p>In this case, we are requesting 4 nodes and all 16 processors on each of those nodes, so a maximum job size of 64 parallel tasks (to match the number of allocated processors) would apply. However, we override the <em>SLURM_TASKS_PER_NODE</em> environment variable to place just a single task on the first node, then fill the next two allocated nodes, and then place just six parallel tasks on the final allocated node – in this case, a total of 1+16+16+6=39 parallel processes. <em>mpirun</em> will automatically pick this up from the Slurm-allocated runtime environment.</p>
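<p>A quick way to sanity-check a geometry like this, relying only on the fact that <em>mpirun</em> reads the Slurm environment as described above (the node names and counts shown are illustrative), is to temporarily launch <em>hostname</em> instead of the real binary and count the ranks per node:</p>
<pre class="preserve-code-formatting"># temporary check inside the same job script: expect 1, 16, 16 and 6 ranks across the 4 nodes
mpirun hostname | sort | uniq -c
#   1 cst001
#  16 cst002
#  16 cst003
#   6 cst004</pre>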
<h3>Parallel Batch Submission of Serial Jobs</h3>
<p>Large numbers of serial jobs can become incredibly inefficient and troublesome on mixed-mode HPC systems. The HPCW Slurm deployment limits the number of running and submitted jobs any single user may have, in contrast to the unlimited submission possible under the previous deployment of LSF.</p>
<p>However, combining <a href="http://www.gnu.org/software/parallel/" target="_blank"><em>GNU Parallel</em></a> and Slurm's <em>srun</em> command allows us to handle such situations in a more controlled and efficient way than in the past. Using this method, a single job is submitted that requests an allocation of X cores, and the GNU <em>parallel</em> command enables us to utilise all of those cores by launching the serial tasks using the <em>srun</em> command.</p>
<p>Here is the example, commented, job submission file <em>serial_batch.sh</em>:</p>
<pre class="preserve-code-formatting">#!/bin/bash --login
#SBATCH -n 12                    #Number of processors in our pool
#SBATCH -o output.%J             #Job output
#SBATCH -t 12:00:00              #Max wall time for entire job

module purge
module load parallel

# Define srun arguments:
srun="srun -n1 -N1 --exclusive"
# --exclusive     ensures srun uses distinct CPUs for each job step
# -N1 -n1         allocates a single core to each task

# Define parallel arguments:
parallel="parallel -N 1 --delay .2 -j $SLURM_NTASKS --joblog parallel_joblog --resume"
# -N 1              is number of arguments to pass to each job
# --delay .2        prevents overloading the controlling node on short jobs
# -j $SLURM_NTASKS  is the number of concurrent tasks parallel runs, so number of CPUs allocated
# --joblog name     parallel's log file of tasks it has run
# --resume          parallel can use a joblog and this to continue an interrupted run (job resubmitted)

# Run the tasks:
$parallel "$srun ./runtask arg1:{1}" ::: {1..32}
# in this case, we are running a script named runtask, and passing it a single argument
# {1} is the first argument
# parallel uses ::: to separate options. Here {1..32} is a shell expansion defining the values
# for the first argument, but it could be any shell command
#
# so parallel will run the runtask script for the numbers 1 through 32, with a maximum of 12
# running at any one time
#
# as an example, the first job will be run like this:
# srun -N1 -n1 --exclusive ./runtask arg1:1</pre>
<p>So, in the above we are requesting an allocation from Slurm of 12 processors, but we have 32 tasks to run. Parallel will execute the jobs as soon as space in our allocation becomes available (i.e. as tasks finish). As this does not have the overhead of setting up a new full job for each task, it is more efficient.</p>
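<p>The same pattern extends to argument lists that are not simple numeric ranges. As a hedged sketch (the file <em>params.txt</em> is a hypothetical example containing one argument per line), GNU Parallel can read the task arguments from a file by using <em>::::</em> in place of <em>:::</em>:</p>
<pre class="preserve-code-formatting"># params.txt contains one argument per line, e.g. input file names
$parallel "$srun ./runtask arg1:{1}" :::: params.txt</pre>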
<p>A simple <em>runtask</em> script that demonstrates the principle by logging helpful text is included here, courtesy of the <a href="https://rcc.uchicago.edu/docs/running-jobs/srun-parallel/index.html#parallel-batch" target="_blank">University of Chicago Research Computing Centre</a>:</p>
<pre class="preserve-code-formatting">#!/bin/sh

# this script echoes some useful output so we can see what parallel
# and srun are doing

sleepsecs=$[ ( $RANDOM % 10 ) + 10 ]s

# $1 is arg1:{1} from parallel.
# $PARALLEL_SEQ is a special variable from parallel. It is the actual sequence
# number of the job regardless of the arguments given
# We output the sleep time, hostname, and date for more info
echo task $1 seq:$PARALLEL_SEQ sleep:$sleepsecs host:$(hostname) date:$(date)

# sleep a random amount of time
sleep $sleepsecs</pre>
<p>So, one would simply submit the job script as per normal:</p>
<pre class="preserve-code-formatting">$ sbatch serial_batch.sh</pre>
<p>And we then see output in the Slurm job output file like this:</p>
<pre class="preserve-code-formatting">...
task arg1:9 seq:9 sleep:10s host:bwc048 date:Thu Jan 28 17:28:14 GMT 2016
task arg1:7 seq:7 sleep:11s host:bwc047 date:Thu Jan 28 17:28:14 GMT 2016
task arg1:10 seq:10 sleep:11s host:bwc048 date:Thu Jan 28 17:28:14 GMT 2016
task arg1:8 seq:8 sleep:14s host:bwc047 date:Thu Jan 28 17:28:14 GMT 2016
...</pre>
<p>The parallel job log also records completed tasks:</p>
<pre class="preserve-code-formatting">Seq     Host    Starttime           JobRuntime      Send    Receive Exitval Signal  Command
9       :       1454002094.231      10.588          0       74      0       0       srun -n1 -N1 --exclusive ./runtask arg1:9
7       :       1454002093.809      11.602          0       74      0       0       srun -n1 -N1 --exclusive ./runtask arg1:7
10      :       1454002094.435      11.384          0       76      0       0       srun -n1 -N1 --exclusive ./runtask arg1:10
8       :       1454002094.023      14.388          0       74      0       0       srun -n1 -N1 --exclusive ./runtask arg1:8
...</pre>
<p>So, by tweaking a few simple commands in the job script and having a <em>runtask</em> script that does something useful, we can accomplish a neat, efficient serial batch system.</p>