Conda (CPU)

Warning

ALL conda operations MUST be performed on the compute nodes via SLURM batch jobs or srun interactive runs.

NEVER run conda create, conda install, conda update, or any other conda commands directly on login nodes.

Login nodes are shared resources. Running conda-related I/O intensive operations on login nodes:

  • Violates resource accounting policies
  • Degrades performance for all users
  • Can result in account restrictions or termination

Overview

This documentation describes how to install packages with conda so that the version and dependency requirements of your provisioned project are satisfied. Conda environments on Discoverer are created with the --prefix option, which places each environment in a custom location (typically your project directory). This approach avoids conda activate and instead relies on explicit environment variable configuration, giving you direct control over versions and dependencies.

Important

ALL conda operations MUST be performed on the compute nodes via SLURM batch jobs or srun interactive runs.

Warning

Do NOT use conda init or conda activate.

Avoid using conda init and conda activate for the following reasons:

  • Interference with other virtual environments: conda activate can interfere with other virtual environment managers (e.g., venv, virtualenv, pipenv) and cause conflicts
  • Pollution of .bashrc: conda init modifies your .bashrc file, which can cause issues in shared environments and HPC systems
  • Less explicit control: The activation mechanism is less transparent than explicit environment variable exports
  • Potential conflicts: Automatic conda initialization can interfere with system modules and other environment setups

Instead, use explicit environment variable exports as shown in this documentation:

export PATH=/valhalla/projects/<your_project_name>/venv-1/bin:${PATH}
export LD_LIBRARY_PATH=/valhalla/projects/<your_project_name>/venv-1/lib:${LD_LIBRARY_PATH}
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

This approach is:

  • Clear and explicit: You can see exactly which environment is being used
  • Non-intrusive: Does not modify system configuration files
  • Compatible: Works well with other virtual environment tools
  • HPC-friendly: Ideal for shared computing environments
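For convenience, the three exports can be collected into a small helper script that you source before running jobs. The file name activate-env.sh and the default prefix below are illustrative, not part of the system setup:

```shell
# activate-env.sh (hypothetical name): source this file to apply the
# explicit exports shown above. Pass the environment prefix as the
# first argument, or edit the default below.
VENV_PREFIX="${1:-/valhalla/projects/<your_project_name>/venv-1}"

export PATH="${VENV_PREFIX}/bin:${PATH}"
export LD_LIBRARY_PATH="${VENV_PREFIX}/lib:${LD_LIBRARY_PATH}"
export VIRTUAL_ENV="${VENV_PREFIX}"

echo "Using environment: ${VIRTUAL_ENV}"
```

Remember to source the file (source activate-env.sh /valhalla/projects/myproj/venv-1) rather than execute it, so the exports affect your current shell.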

Tip

Important: Do NOT install Miniconda or Anaconda in your home directory.

Installing Miniconda or Anaconda in your home folder consumes significant disk space unnecessarily. The system provides conda through the anaconda3 environment module, which is already available and can be used to install any package that is installable with conda.

Always use the system-provided conda by loading the module:

module load anaconda3

This module is automatically loaded in all SLURM batch scripts shown in this documentation. There is no need to install your own conda distribution.

Conda Configuration: Fixing “No space left on device” Error

Warning

Important: Before creating conda environments, ensure your conda configuration uses directories with sufficient disk space. By default, conda may try to use /tmp, which has very limited space (about 320M, typically 95% full), causing “No space left on device” errors even when your target directory has plenty of space.

Problem

Conda fails with NoSpaceLeftError: No space left on devices even when the target directory has plenty of space. This happens because conda’s configuration file (~/.condarc) has pkgs_dirs and envs_dirs pointing to /tmp, which is on a different filesystem with limited space.

Diagnosis

Check filesystem space usage:

# Check space on /tmp and your target directory
df -h /tmp /valhalla/projects/<your_project_name>/

# Check which filesystem /tmp is on
df -h /tmp

Expected output shows:

  • /tmp is on /dev/mapper/live-rw with only 320M free (95% full)
  • /valhalla has 50T available

Check conda’s configuration:

module load anaconda3
conda info

Look for:

  • package cache : /tmp/... (where conda stores downloaded packages)
  • envs directories : /tmp/... (where conda stores environments)

Check your conda config file:

module load anaconda3
cat ~/.condarc

The problem: Your ~/.condarc likely contains:

pkgs_dirs:
  - /tmp/moose.ntCRtFtw/_env/.pkgs
envs_dirs:
  - /tmp/moose.ntCRtFtw/_env/.envs

These directories are on /tmp, which is 95% full. Conda needs space in pkgs_dirs to:

  • Download packages
  • Extract packages
  • Build packages from source
  • Cache package metadata

These operations can require significant space (often several GB for Python 3.12 with dependencies).

Solution

Option 1: Update conda configuration (Recommended - Permanent Fix)

Modify your ~/.condarc file to use /valhalla for package cache and environments:

module load anaconda3

# Create directories on /valhalla
mkdir -p /valhalla/projects/<your_project_name>/conda/pkgs
mkdir -p /valhalla/projects/<your_project_name>/conda/envs

# Update conda configuration (use --add for list parameters)
conda config --add pkgs_dirs /valhalla/projects/<your_project_name>/conda/pkgs
conda config --add envs_dirs /valhalla/projects/<your_project_name>/conda/envs

# Verify the change
conda info | grep -E "(package cache|envs directories)"

Note: Use --add (not --set) because pkgs_dirs and envs_dirs are list parameters. The --add command adds the directory to the beginning of the list, and conda uses the first writable location.

Option 2: Edit ~/.condarc manually

Alternatively, you can edit ~/.condarc directly:

# Backup your current config
cp ~/.condarc ~/.condarc.backup

# Edit the file
nano ~/.condarc  # or use your preferred editor

Change:

pkgs_dirs:
  - /tmp/moose.ntCRtFtw/_env/.pkgs
envs_dirs:
  - /tmp/moose.ntCRtFtw/_env/.envs

To:

pkgs_dirs:
  - /valhalla/projects/<your_project_name>/conda/pkgs
envs_dirs:
  - /valhalla/projects/<your_project_name>/conda/envs

Option 3: Use environment variable for single command

For a one-time fix without changing your config:

module load anaconda3

# Create temp directory on /valhalla
mkdir -p /valhalla/projects/<your_project_name>/conda/pkgs

# Override pkgs_dirs for this command only
CONDA_PKGS_DIRS=/valhalla/projects/<your_project_name>/conda/pkgs conda create --prefix /valhalla/projects/<your_project_name>/venv/llvmlite python=3.12 -y

Option 4: Set TMPDIR (for additional temp operations)

Even after fixing pkgs_dirs, you may also want to set TMPDIR for other temporary operations:

export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

Verification

After updating conda configuration, verify it’s using the correct directories:

module load anaconda3
conda info | grep -E "(package cache|envs directories)"

You should see:

package cache : /valhalla/projects/<your_project_name>/conda/pkgs
envs directories : /valhalla/projects/<your_project_name>/conda/envs
                  /opt/software/anaconda3/envs
                  /home/<username>/.conda/envs

Also verify your config file:

cat ~/.condarc | grep -A 2 -E "(pkgs_dirs|envs_dirs)"

Initial setup

Warning

ALL conda operations MUST be performed on the compute nodes via SLURM batch jobs or srun interactive runs.

The initial setup process consists of creating a SLURM batch script file (on the login node, using a text editor) and then submitting it to run on compute nodes. All conda operations execute on compute nodes.

Step 1: Create the SLURM installation script file

On the login node: Create a text file named install_conda_env.sh using your preferred text editor (nano, vim, emacs, etc.). This step is safe on the login node because it involves no conda operations - you are only creating a text file.

The file should contain the following SLURM batch script (change the package list to the one that fits your goals):

#!/bin/bash
#SBATCH --job-name=install_conda_env
#SBATCH --output=install_env_%j.out
#SBATCH --error=install_env_%j.err
#SBATCH --time=02:00:00
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --nodes=1
#SBATCH --partition=cn
#SBATCH --account=your_project
#SBATCH --mem=16G

# Load Anaconda module
module load anaconda3

# Set TMPDIR to project directory to avoid using /tmp
export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

# Set virtual environment path (replace with your project name)
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

# Create conda environment
echo "Creating conda environment at ${VIRTUAL_ENV}"
conda create --prefix ${VIRTUAL_ENV} python=3.12.2 -y

# Set up environment variables
export PATH=${VIRTUAL_ENV}/bin:${PATH}
export LD_LIBRARY_PATH=${VIRTUAL_ENV}/lib:${LD_LIBRARY_PATH}

# Install packages from conda-forge
echo "Installing packages from conda-forge"
conda install --prefix ${VIRTUAL_ENV} -c conda-forge scikit-learn scipy numpy matplotlib mdanalysis pandas -y

# Verify installation
echo "Verifying installation..."
which python
python --version
conda list --prefix ${VIRTUAL_ENV}

echo "Installation complete!"

Configuration parameters:

  • Replace <your_project_name> with your actual project name
  • Adjust --time based on the number of packages (installation can take 30 minutes to 2 hours)
  • Increase --mem if installing large packages (16G is usually sufficient)
  • Modify the package list as needed
  • The -y flag automatically confirms the installation
  • Specify the Python version that meets your requirements

Alternative: Interactive installation using srun

Instead of submitting a batch job, you can run the installation interactively on a compute node using srun. This is useful if you want to see the output in real-time or interact with the installation process.

On the login node (login.discoverer.bg): Run the following command to start an interactive session on a compute node:

srun --job-name=install_conda_env \
     --time=02:00:00 \
     --ntasks-per-node=2 \
     --cpus-per-task=1 \
     --nodes=1 \
     --partition=cn \
     --account=your_project \
     --mem=16G \
     --pty bash

Once the interactive session starts, execute the installation commands:

# Load Anaconda module
module load anaconda3

# Set TMPDIR to project directory to avoid using /tmp
export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

# Set virtual environment path (replace with your project name)
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

# Create conda environment
echo "Creating conda environment at ${VIRTUAL_ENV}"
conda create --prefix ${VIRTUAL_ENV} python=3.12.2 -y

# Set up environment variables
export PATH=${VIRTUAL_ENV}/bin:${PATH}
export LD_LIBRARY_PATH=${VIRTUAL_ENV}/lib:${LD_LIBRARY_PATH}

# Install packages from conda-forge
echo "Installing packages from conda-forge"
conda install --prefix ${VIRTUAL_ENV} -c conda-forge scikit-learn scipy numpy matplotlib mdanalysis pandas -y

# Verify installation
echo "Verifying installation..."
which python
python --version
conda list --prefix ${VIRTUAL_ENV}

echo "Installation complete!"

Note: The --pty bash flag allocates a pseudo-terminal, allowing you to interact with the session. When you’re done, type exit to end the interactive session.

Step 2: Submit the SLURM job

On the login node: Submit the batch script to the SLURM scheduler. The job will execute on compute nodes where all conda operations will run:

sbatch install_conda_env.sh

After submission, SLURM will display a job ID (e.g., Submitted batch job 12345). This job ID is used in the output and error filenames.

On the login node: Verify that your job is queued (this is a read-only operation, safe on login node):

squeue --me

This command shows all jobs submitted by your user account. You can monitor the job status and wait for it to complete.

Step 3: Check the results

On the login node: After the job completes, check the output and error files:

# View the output file (replace 12345 with your job ID)
cat install_env_12345.out

# Check for errors
cat install_env_12345.err

If the installation was successful, you should see messages indicating that packages were installed and the environment was created.
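A quick way to triage the error file is to scan it for common failure markers. The job ID 12345 continues the example above, and the marker pattern is only a suggestion:

```shell
# Scan the SLURM error log for typical failure markers; the job ID and
# filename follow the example above - substitute your own.
ERRFILE="install_env_12345.err"

if [ -s "${ERRFILE}" ] && grep -qiE "error|traceback|no space left" "${ERRFILE}"; then
  echo "Possible errors found in ${ERRFILE}:"
  # Show the first matching lines with their line numbers
  grep -inE "error|traceback|no space left" "${ERRFILE}" | head -n 20
else
  echo "No obvious errors in ${ERRFILE}"
fi
```

Note that conda writes progress bars to stderr, so a non-empty .err file does not by itself indicate failure; the pattern match above helps separate noise from real errors.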

When to create a new environment vs. reuse existing

Scenario 1: Adding compatible packages

  • Action: Reuse existing environment
  • Reason: New packages (e.g., pandas, matplotlib) are compatible with existing packages

Scenario 2: Need Python 3.11 instead of 3.12

  • Action: Create new environment (venv-python311)
  • Reason: Python version requirement differs

Scenario 3: Package conflict detected

  • Action: Create new environment for the conflicting package set
  • Reason: Conda reports dependency conflicts that cannot be resolved

Scenario 4: Starting a new project - Action: Evaluate if existing environment meets requirements

  • If yes: Reuse
  • If no or conflicts: Create new environment with project-specific name

Using the virtual environment

Step 1: Set the environment variables

Before running your Python scripts, configure the environment variables:

export PATH=/valhalla/projects/<your_project_name>/venv-1/bin:${PATH}
export LD_LIBRARY_PATH=/valhalla/projects/<your_project_name>/venv-1/lib:${LD_LIBRARY_PATH}
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

Step 2: Run your Python script

Execute your Python script normally:

python script.py

All packages installed in the conda environment will be available to your script.
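To confirm at runtime that the expected packages resolve from the environment's Python, a small shell function such as the following can help. The function name check_imports and the package list are examples; adjust the list to match what you installed:

```shell
# check_imports (hypothetical helper): report whether each named package
# imports cleanly with the Python currently first on PATH.
check_imports() {
  local failed=0
  for pkg in "$@"; do
    if python -c "import ${pkg}" 2>/dev/null; then
      echo "${pkg}: OK"
    else
      echo "${pkg}: MISSING"
      failed=1
    fi
  done
  return "${failed}"
}

check_imports numpy scipy pandas || echo "Some packages failed to import"
```

Run this after setting the environment variables from Step 1, so that python resolves to the environment's interpreter.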

SLURM batch script example

Here’s how to set up the environment in a SLURM batch script:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err
#SBATCH --time=01:00:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1
#SBATCH --partition=cn
#SBATCH --account=your_project
#SBATCH --mem=8G

# Load Anaconda module
module load anaconda3

# Set TMPDIR to project directory to avoid using /tmp
export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

# Set up conda environment
export PATH=/valhalla/projects/<your_project_name>/venv-1/bin:${PATH}
export LD_LIBRARY_PATH=/valhalla/projects/<your_project_name>/venv-1/lib:${LD_LIBRARY_PATH}
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

# Run your Python script
python script.py

Implementation details

  1. No conda activate or conda init: This method uses explicit environment variable exports instead of conda activate or conda init. This avoids interference with other virtual environments, prevents pollution of .bashrc, and provides clearer control over the environment setup.
  2. Custom paths: Environments are created in project directories using --prefix
  3. Persistent setup: Environment variables must be set each time you want to use the environment
  4. SLURM compatibility: This approach works well in SLURM batch scripts
  5. Login node restrictions: All conda create, install, and update operations must be performed through SLURM batch jobs on compute nodes, not on login nodes
  6. Disk space management: Ensure conda is configured to use /valhalla for package cache and environments to avoid “No space left on device” errors

Troubleshooting

Verify environment setup

On the login node: Lightweight, read-only verification commands can be run on login nodes after setting environment variables (these do not execute conda commands):

# Set environment variables (read-only operation, safe on login node)
export PATH=/valhalla/projects/<your_project_name>/venv-1/bin:${PATH}
export LD_LIBRARY_PATH=/valhalla/projects/<your_project_name>/venv-1/lib:${LD_LIBRARY_PATH}
export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1

# Verify Python path (read-only, safe on login node)
which python
# Should show: /valhalla/projects/<your_project_name>/venv-1/bin/python

python -c "import sys; print(sys.executable)"
# Should show the path to your environment's Python

Note: Any verification that requires conda commands must be performed in a SLURM batch job, not on the login node.

Check the installed packages

Warning

The conda list command MUST be run on compute nodes via SLURM batch jobs or srun interactive runs, NOT on the login node.

To check installed packages, create and submit a SLURM batch script:

#!/bin/bash
#SBATCH --job-name=list_packages
#SBATCH --output=list_packages_%j.out
#SBATCH --error=list_packages_%j.err
#SBATCH --time=00:05:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --nodes=1
#SBATCH --partition=cn
#SBATCH --account=your_project
#SBATCH --mem=2G

module load anaconda3
conda list --prefix /valhalla/projects/<your_project_name>/venv-1

Alternative: Interactive check using srun

Instead of submitting a batch job, you can check installed packages interactively on a compute node using srun.

On the login node (login.discoverer.bg): Run the following command to start an interactive session on a compute node:

srun --job-name=list_packages \
     --time=00:05:00 \
     --ntasks-per-node=1 \
     --cpus-per-task=1 \
     --nodes=1 \
     --partition=cn \
     --account=your_project \
     --mem=2G \
     --pty bash

Once the interactive session starts, execute the command:

module load anaconda3
conda list --prefix /valhalla/projects/<your_project_name>/venv-1

Note: The --pty bash flag allocates a pseudo-terminal, allowing you to interact with the session. When you’re done, type exit to end the interactive session.

Update packages

Warning

Package updates MUST be performed on compute nodes via SLURM batch jobs or srun interactive runs, NOT on login nodes.

Update packages in the environment using a SLURM batch script:

#!/bin/bash
#SBATCH --job-name=update_packages
#SBATCH --output=update_packages_%j.out
#SBATCH --error=update_packages_%j.err
#SBATCH --time=01:00:00
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --nodes=1
#SBATCH --partition=cn
#SBATCH --account=your_project
#SBATCH --mem=16G

module load anaconda3

# Set TMPDIR to project directory to avoid using /tmp
export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1
conda update --prefix ${VIRTUAL_ENV} --all -y

Alternative: Interactive update using srun

Instead of submitting a batch job, you can update packages interactively on a compute node using srun.

On the login node (login.discoverer.bg): Run the following command to start an interactive session on a compute node:

srun --job-name=update_packages \
     --time=01:00:00 \
     --ntasks-per-node=2 \
     --cpus-per-task=1 \
     --nodes=1 \
     --partition=cn \
     --account=your_project \
     --mem=16G \
     --pty bash

Once the interactive session starts, execute the update commands:

module load anaconda3

# Set TMPDIR to project directory to avoid using /tmp
export TMPDIR=/valhalla/projects/<your_project_name>/tmp
mkdir -p $TMPDIR

export VIRTUAL_ENV=/valhalla/projects/<your_project_name>/venv-1
conda update --prefix ${VIRTUAL_ENV} --all -y

Note: The --pty bash flag allocates a pseudo-terminal, allowing you to interact with the session. When you’re done, type exit to end the interactive session.

Remove environment

To remove an environment:

rm -rf /valhalla/projects/<your_project_name>/venv-1

Environment management guidelines

  1. Use project directories: Store environments in your project directory (e.g., /valhalla/projects/<your_project_name>/)
  2. Name environments clearly: Use descriptive names like venv-1, venv-numba, venv-scikit
  3. Document dependencies: Keep track of which packages you install for reproducibility
  4. Version control: Consider documenting your environment setup in your project documentation
  5. Test in SLURM: Always test your environment setup in a SLURM batch script before running large jobs
  6. Configure conda properly: Ensure conda is configured to use /valhalla for package cache and environments to avoid disk space issues
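For guideline 3 (documenting dependencies), one approach is a short batch job that records the environment's package list in a spec file that conda can later re-create from. The script name export_env.sh and output file name env-packages.txt are illustrative; the SLURM headers mirror the batch scripts earlier in this document:

```shell
# Write a batch script (hypothetical name export_env.sh) that records
# the installed packages; submit it with: sbatch export_env.sh
cat > export_env.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=export_env
#SBATCH --output=export_env_%j.out
#SBATCH --error=export_env_%j.err
#SBATCH --time=00:05:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --nodes=1
#SBATCH --partition=cn
#SBATCH --account=your_project
#SBATCH --mem=2G

module load anaconda3

# --export produces a spec usable later with: conda create --file env-packages.txt
conda list --prefix /valhalla/projects/<your_project_name>/venv-1 --export > env-packages.txt
EOF
```

Keeping env-packages.txt under version control alongside your project makes the environment reproducible if it ever needs to be rebuilt.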

Additional Notes

  1. Cleanup: The temp directory on /valhalla will accumulate files over time. You can clean it periodically:

    rm -rf /valhalla/projects/<your_project_name>/tmp/*
    
  2. Disk space monitoring: To monitor which filesystem conda is trying to use, you can check conda’s verbose output:

conda create --prefix /valhalla/projects/<your_project_name>/venv/llvmlite python=3.12 -y -v
    
  3. Alternative: If you can’t modify TMPDIR, you could also try using mktemp to create a temp directory on /valhalla:

TMPDIR=$(mktemp -d -p /valhalla/projects/<your_project_name>) conda create --prefix /valhalla/projects/<your_project_name>/venv/llvmlite python=3.12 -y
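The periodic cleanup in note 1 can also be made age-aware, so that only stale files are removed and anything a running job may still need is left alone. The 7-day threshold is an example:

```shell
# Remove only files untouched for more than 7 days from the project
# temp directory (path and age threshold are examples - adjust them).
TMP_BASE="/valhalla/projects/<your_project_name>/tmp"

if [ -d "${TMP_BASE}" ]; then
  # Delete files whose modification time is older than 7 days
  find "${TMP_BASE}" -type f -mtime +7 -delete
  # Remove any subdirectories left empty by the deletion above
  find "${TMP_BASE}" -mindepth 1 -type d -empty -delete
fi
```

Unlike a blanket rm -rf of the directory contents, this keeps recently written files, which is safer if a long-running job is still using the temp area.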