Anaconda/Conda¶
Table of Contents
- Anaconda/Conda
- About
- Why not to install Anaconda and conda on a per-user basis
- How to use conda from our Anaconda installation via environment modules
- Using conda on Discoverer (CPU cluster)
- Using conda on Discoverer+ (GPU cluster)
- How to use the created Python virtual environments
- Important Reminders
- Getting Help
About¶
For comprehensive information about Anaconda and conda, refer to the official Anaconda documentation and the conda user guide.
This guide explains how to properly access and use conda from the Anaconda installation supported on the Discoverer and Discoverer+ high-performance computing clusters.
It is crucial that users do NOT install their own versions of Anaconda, as this wastes valuable storage space on the cluster’s shared storage systems.
Why not to install Anaconda and conda on a per-user basis¶
The shared installation keeps storage use efficient:
- Personal Anaconda installations consume massive amounts of storage (typically 3-5 GB per installation)
- Multiple users installing Anaconda individually can quickly exhaust cluster storage quotas
- Most of the installed packages are rarely used, leading to inefficient storage utilization
- Shared storage space is limited and expensive on HPC clusters
The shared installation integrates well with the system:
- Environment modules provide optimized, pre-configured Anaconda installations
- Automatic PATH configuration ensures the conda executable is immediately accessible in the Bash shell
- Version management allows switching between different Anaconda versions as needed
- Consistent environment across all cluster nodes
How to use conda from our Anaconda installation via environment modules¶
Do not run conda on the login nodes! We strongly recommend using conda from within Slurm batch scripts or from an interactive Bash shell started on a compute node with srun.
Using conda on Discoverer (CPU cluster)¶
Warning
Never run conda directly on the login node (login.discoverer.bg).
Warning
Do not install Python virtual environments using conda in your home folder due to limited storage capacity. Always use the project storage location instead.
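The warning above can be turned into a small guard that runs before any environment is created. The sketch below is only an illustration: prefix_outside_home is a hypothetical helper name, not part of conda or of the cluster tooling, and it performs a purely textual comparison against $HOME (it does not resolve symbolic links or relative paths).

```shell
# Hypothetical helper (not part of conda or the cluster tooling):
# succeeds only when the given prefix lies outside the home folder.
# The check is textual; symbolic links and relative paths are not resolved.
prefix_outside_home() {
    case "$1" in
        "$HOME"|"$HOME"/*|"~"|"~"/*)
            echo "Refusing prefix inside the home folder: $1" >&2
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}

# Possible use inside an srun session or a job script
# (replace the placeholder in angle brackets with your project account name):
#   prefix="/valhalla/projects/<your_slurm_project_account_name>/virtual/python-3.13"
#   prefix_outside_home "$prefix" && conda create --prefix="$prefix" python=3.13
```

Because the helper returns a non-zero status for home-folder paths, chaining it with && stops conda create from running in that case.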
Interactively¶
You can start an interactive Bash session using srun and work with conda within that session:
srun --partition=cn --nodes=1 --ntasks-per-node=2 --mem=2G \
--cpus-per-task=1 --account=<your_slurm_project_account_name> \
--time=00:30:00 --pty /bin/bash
Once Slurm starts the requested interactive Bash session, load the anaconda environment module to enable access to the conda tool:
module load anaconda
Afterwards, you may use conda to manage Python virtual environments.
Here is an example showing how to create a Python 3.13 virtual environment using the interactive Bash session started as explained above, and store the environment content inside your project storage folder:
conda create --prefix=/valhalla/projects/<your_slurm_project_account_name>/virtual/python-3.13 \
python=3.13
Warning
Do not follow the suggestion displayed by conda to manually execute this:
conda init
To understand why following that suggestion is a bad idea on HPC clusters, read the section "How to use the created Python virtual environments" below carefully.
Through a Slurm job¶
You can also create the environment non-interactively by submitting a Slurm job script with content similar to this one (it assumes the use of your default QoS):
#!/bin/bash
#SBATCH --partition=cn
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --mem=2G
#SBATCH --cpus-per-task=1
#SBATCH --account=<your_slurm_project_account_name>
#SBATCH --time=00:30:00
module load anaconda
# This example creates a Python 3.13 virtual environment
# inside your project storage folder:
conda create --prefix=/valhalla/projects/<your_slurm_project_account_name>/virtual/python-3.13 \
python=3.13
Warning
Do not follow the suggestion that conda displays after the installation to manually execute this:
conda init
Using conda on Discoverer+ (GPU cluster)¶
Warning
Never run conda directly on the login node (login-plus.discoverer.bg).
Warning
Do not install Python virtual environments using conda in your home folder due to limited storage capacity. Always use the project storage location instead.
Note
To avoid exhausting your GPU resource allocation, run conda-driven installations under a QoS that allows the execution of CPU-only jobs, as shown in the examples below.
Interactively¶
You can start an interactive Bash session using srun and work with conda within that session:
srun --partition=common --nodes=1 --ntasks-per-node=2 --mem=2G \
--cpus-per-task=1 --account=<your_slurm_project_account_name> \
--qos=2cpu-single-host \
--time=00:30:00 --pty /bin/bash
Once Slurm starts the requested interactive Bash session, load the anaconda environment module to enable access to the conda tool:
module load anaconda
Afterwards, you may use conda to manage Python virtual environments.
Here is an example showing how to create a Python 3.13 virtual environment using the interactive Bash session started as explained above, and store the environment content inside your project storage folder:
conda create --prefix=/valhalla/projects/<your_slurm_project_account_name>/virtual/python-3.13 \
python=3.13
Warning
Do not follow the suggestion displayed by conda to manually execute this:
conda init
To understand why following that suggestion is a bad idea on HPC clusters, read the section "How to use the created Python virtual environments" below carefully.
Through a Slurm job¶
You can also create the environment non-interactively by submitting a Slurm job script with content similar to this one:
#!/bin/bash
#SBATCH --partition=common
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --mem=2G
#SBATCH --cpus-per-task=1
#SBATCH --account=<your_slurm_project_account_name>
#SBATCH --qos=2cpu-single-host
#SBATCH --time=00:30:00
module load anaconda
# This example creates a Python 3.13 virtual environment
# inside your project storage folder:
conda create --prefix=/valhalla/projects/<your_slurm_project_account_name>/virtual/python-3.13 \
python=3.13
Warning
Do not follow the suggestion displayed by conda to manually execute this:
conda init
To understand why following that suggestion is a bad idea on HPC clusters, read the section "How to use the created Python virtual environments" below carefully.
How to use the created Python virtual environments¶
Note that Python virtual environments can, in many cases, be used instead of containers. As long as you only need to isolate an application in a specific environment of libraries and environment variables, Python virtual environments are the easiest way to do so.
HPC systems, and most AI factory setups with a shared environment for that matter, may host numerous Python virtual environments on a per-user basis. Sometimes the granularity is so fine that a separate Python virtual environment backs the execution of a single application.
Therefore, it may become problematic if users follow the advice emitted by conda to execute:
conda init
Running it modifies the ~/.bashrc file, so that every new Bash session of the user points, in most cases, to one particular Python virtual environment. That may create ambiguity or hard-to-trace problems, since it adds another layer of complexity to the application execution.
Our advice is therefore to avoid embedding Python virtual environments created by conda into users' Bash profiles unless it is absolutely necessary.
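To check whether conda init has already been run for your account, you can look for the block it writes into the shell startup file. The sketch below assumes that the block starts with the marker comment "# >>> conda initialize >>>", which is what conda releases known to us write; has_conda_init_block is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: reports whether a shell startup file contains the
# block that `conda init` writes (assumed to start with the marker below).
has_conda_init_block() {
    # $1: file to inspect; defaults to ~/.bashrc
    rc_file="${1:-$HOME/.bashrc}"
    [ -f "$rc_file" ] && grep -q '^# >>> conda initialize >>>' "$rc_file"
}

if has_conda_init_block; then
    echo "conda init block found in ~/.bashrc; consider removing it"
else
    echo "no conda init block found in ~/.bashrc"
fi
```

The check only reads the file. If the block is present, removing it by hand (everything between the two "conda initialize" marker lines) restores the default Bash session.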
The steps below show how to use an already installed Python virtual environment without modifying ~/.bashrc.
Step 1: Load the anaconda module¶
To make the Anaconda installation (not only conda) available in your session, load the module:
module load anaconda
Step 2: Add access to a selected Python virtual environment to the current Bash session¶
In the same Bash session in which the anaconda module is loaded, set the Bash environment variables PATH and VIRTUAL_ENV so that they point to the installation of the Python virtual environment. For example:
export PATH="/valhalla/projects/<your_slurm_project_account_name>/pytorch_env_nomkl/bin:$PATH"
export VIRTUAL_ENV="/valhalla/projects/<your_slurm_project_account_name>/pytorch_env_nomkl"
Here we presume the virtual environment is created under the /valhalla/projects/<your_slurm_project_account_name>/pytorch_env_nomkl folder.
This approach allows you to select which Python virtual environment to use in your current Bash session. When the session ends or you reset the environment variables, the selected Python virtual environment is no longer accessible by default.
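If you switch between several environments, the two export lines can be wrapped in a small shell function. This is a minimal sketch, not a tool provided on the clusters: use_python_env is a hypothetical name, and the function assumes the environment keeps its interpreter in bin/python, as conda environments on Linux normally do.

```shell
# Hypothetical helper: points the current Bash session at one environment
# by setting VIRTUAL_ENV and prepending the environment's bin folder to PATH.
use_python_env() {
    env_dir="$1"
    # Refuse to touch PATH if the folder does not look like an environment.
    if [ ! -x "$env_dir/bin/python" ]; then
        echo "No Python interpreter found in $env_dir/bin" >&2
        return 1
    fi
    export VIRTUAL_ENV="$env_dir"
    export PATH="$env_dir/bin:$PATH"
}

# Possible use after `module load anaconda`
# (replace the placeholder in angle brackets with your project account name):
#   use_python_env "/valhalla/projects/<your_slurm_project_account_name>/pytorch_env_nomkl"
#   command -v python   # should print a path inside the selected environment
```

Because the function only changes variables of the running session, nothing is written to ~/.bashrc and the selection disappears when the session ends.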
Step 3: Run the application in the same Bash session within the Python virtual environment¶
module load nvidia/cuda/12/12.8
python /valhalla/projects/<your_slurm_project_account_name>/pytorch_gpu_detection.py
Important Reminders¶
- DO NOT install Anaconda manually - always use module load anaconda
- Clean up unused environments to reduce the storage space utilisation
- Use project-specific environments rather than global installations
- Include module load anaconda in your job scripts when using conda or a Python virtual environment created by conda
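To decide which environments are worth cleaning up, it helps to see how much space each one occupies. The sketch below assumes that all your environments live as sub-folders of a single directory (for example the virtual folder used in the examples above); env_sizes is a hypothetical helper name.

```shell
# Hypothetical helper: prints the size in KiB of every folder located
# directly under the given directory, largest first.
env_sizes() {
    envs_dir="$1"
    if [ ! -d "$envs_dir" ]; then
        echo "Not a directory: $envs_dir" >&2
        return 1
    fi
    # du -sk reports one summary line per folder; sort -rn puts the largest on top.
    du -sk "$envs_dir"/*/ 2>/dev/null | sort -rn
}

# Possible use on a compute node
# (replace the placeholder in angle brackets with your project account name):
#   env_sizes "/valhalla/projects/<your_slurm_project_account_name>/virtual"
```

An environment that is no longer needed can then be deleted with conda env remove --prefix followed by its path, run on a compute node after module load anaconda.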