Submitting, monitoring, and canceling jobs
==========================================

.. toctree::
   :maxdepth: 1
   :caption: Contents:

.. role:: underline
   :class: underline

About
-----

This document provides an overview of the basic methods for submitting, monitoring, and canceling running or submitted jobs.

Before following the recommendations below, users who will be managing jobs on Discoverer must:

#. Have a valid account to access the Discoverer cluster
#. Be aware of the way we at Discoverer allocate and count the computational resources (see :doc:`computational_resources_allocation`)
#. Read the document :doc:`writing_slurm_batch`

Job submission
--------------

Batch script submission
.......................

.. warning::

   This is the preferred way of submitting jobs.

.. important::

   Before submitting a script to the queue, make sure the script meets the requirements in :doc:`writing_slurm_batch`.

The easiest way to send a Slurm batch script to the queue is to execute the following command line on the login node:

.. code-block:: bash

   sbatch job.batch

where ``job.batch`` is the file containing the script lines necessary for running the job. The document at https://slurm.schedmd.com/sbatch.html describes the additional options that, if required, may be incorporated into the script or passed as arguments to ``sbatch``.

Upon successful submission, ``sbatch`` returns the assigned job ID (an integer). Later, the submitted job can be monitored or canceled based on that ID.

Interactive job submission and execution
........................................

.. warning::

   This way of running jobs is not promoted or supported by us.

Monitoring of submitted jobs
----------------------------

The easiest way to monitor the job execution is to execute:

.. code-block:: bash

   squeue -j jobID

where ``jobID`` is the job ID provided by ``sbatch`` upon submission.

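
The submission and monitoring steps above can be combined by capturing the job ID at submission time. The following is only a minimal sketch: the function name ``submit_and_show`` is hypothetical and not part of Slurm or of the Discoverer setup. It assumes the standard ``--parsable`` option of ``sbatch``, which prints only the job ID (followed by ``;`` and the cluster name when applicable), and the ``-j`` option of ``squeue``, which selects jobs by ID.

.. code-block:: bash

   #!/bin/bash
   # Hypothetical helper: submit a batch script and show its queue entry.
   submit_and_show() {
       local script="$1"
       local job_id

       # --parsable makes sbatch print only "jobID" or "jobID;cluster"
       job_id=$(sbatch --parsable "$script") || return 1

       # Drop the optional ";cluster" suffix, keeping the numeric job ID
       job_id="${job_id%%;*}"

       echo "Submitted job ${job_id}"

       # List only the job that has just been submitted
       squeue -j "$job_id"
   }

Once the function is defined in the current shell, running ``submit_and_show job.batch`` on the login node should print the job ID, followed by the matching ``squeue`` entry.
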

If you would like to list all your submitted jobs:

.. code-block:: bash

   squeue --me

The ``squeue`` tool can provide extended information about the job size and its execution. The document at https://slurm.schedmd.com/squeue.html lists all the options one can pass to ``squeue``.

Canceling jobs
..............

Only successfully submitted jobs can be canceled. To cancel a job with ID ``jobID`` use ``scancel``:

.. code-block:: bash

   scancel jobID

.. warning::

   If a job cannot be canceled, ask the Support team to do that (see :doc:`help`).

Getting help
------------

See :doc:`help`