LZ4

Extremely fast lossless compression algorithm

Overview

LZ4 is a lossless compression algorithm built for speed: it compresses at more than 500 MB/s per core and scales across multi-core CPUs. It trades some compression ratio for very fast compression and decompression, which suits applications that prioritize speed over maximum compression.

LZ4 is well suited to real-time compression, in-memory compression, network transmission, and storage systems where compression and decompression speed are critical. Its performance comes from speed-optimized implementations and efficient memory access patterns.

Available versions

To view available lz4 versions:

module avail lz4

Build recipes and configuration details are maintained in our GitLab repository:

Build optimizations

Our LZ4 installations are optimized for performance on Discoverer’s hardware. The LZ4 library code is built with recent LLVM Compiler Infrastructure compilers, which are the default compilers on the Discoverer Petascale Supercomputer.

Compiler optimizations:

  • Link Time Optimization (LTO): Full LTO (-flto=full) is enabled for both compilation and linking, allowing cross-module optimizations that significantly improve performance.
  • CPU-Specific Optimizations:
      - -march=native: Optimizes for the native CPU architecture, enabling all available instruction sets
      - -mtune=native: Tunes the generated code specifically for the target CPU
      - -mfma: Enables FMA (Fused Multiply-Add) instructions for improved floating-point performance
  • Position Independent Code: -fPIC is used to enable shared library support.

Linker optimizations:

  • LLD Linker: We use LLVM’s LLD linker for faster linking and better optimization support.
  • LTO at Link Time: Full link-time optimization enables whole-program optimizations.

Build configuration:

  • Release Build: All optimizations are enabled for production use.
  • Multi-threading: Built-in support for multi-threaded compression and decompression.
  • Full Feature Set: All compression modes and features are enabled, providing maximum flexibility.
  • Dual Library Builds: Both shared (.so) and static (.a) libraries are built and installed, providing flexibility for different use cases.

These optimizations ensure that our LZ4 installation provides the fastest possible compression and decompression performance for CPU-based applications on Discoverer, while maintaining full compatibility with the standard LZ4 API.

Available libraries

LZ4 provides the liblz4 shared library that is installed by default:

liblz4.so - LZ4 compression library

This library implements the LZ4 compression algorithm, providing extremely fast lossless compression and decompression with reasonable compression ratios.

  • Header file: lz4.h
  • Link flag: -llz4
  • pkg-config: liblz4

Note

The library uses optimized implementations and can be used in both C and C++ applications. It is particularly effective for real-time compression scenarios and in-memory compression operations.

Library variants

The liblz4 library is available as both static (.a) and shared (.so) libraries. The Environment Modules automatically configure the appropriate paths for dynamic linking, which is the recommended approach for HPC environments.

Shared libraries (recommended):
  • liblz4.so is used by default
  • Automatically configured when loading the module
  • Recommended for HPC environments
Static libraries:
  • liblz4.a is also available
  • Use only if your application specifically requires static linking
  • Requires explicit -static flag during linking

Linking your application

After loading the lz4 module, the environment variables are automatically configured. You can link your application using one of the following methods:

Method 1: Using environment variables (recommended)

# Load the module first
module load lz4/<version>

# Link against liblz4 - C code
gcc -o myapp myapp.c $CFLAGS $LDFLAGS -llz4
clang -o myapp myapp.c $CFLAGS $LDFLAGS -llz4

# Link against liblz4 - C++ code
g++ -o myapp myapp.cpp $CXXFLAGS $LDFLAGS -llz4
clang++ -o myapp myapp.cpp $CXXFLAGS $LDFLAGS -llz4

Method 2: Using pkg-config

# Load the module first
module load lz4/<version>

# Link against liblz4 - C code
gcc -o myapp myapp.c $(pkg-config --cflags --libs liblz4)
clang -o myapp myapp.c $(pkg-config --cflags --libs liblz4)

# Link against liblz4 - C++ code
g++ -o myapp myapp.cpp $(pkg-config --cflags --libs liblz4)
clang++ -o myapp myapp.cpp $(pkg-config --cflags --libs liblz4)

Method 3: Manual linking

# Load the module first
module load lz4/<version>

# Link against liblz4 - C code
gcc -o myapp myapp.c -I$LZ4_ROOT/include -L$LZ4_ROOT/lib64 -llz4
clang -o myapp myapp.c -I$LZ4_ROOT/include -L$LZ4_ROOT/lib64 -llz4

# Link against liblz4 - C++ code
g++ -o myapp myapp.cpp -I$LZ4_ROOT/include -L$LZ4_ROOT/lib64 -llz4
clang++ -o myapp myapp.cpp -I$LZ4_ROOT/include -L$LZ4_ROOT/lib64 -llz4

Static linking (if required):

If your application specifically requires static linking:

# C code
gcc -o myapp myapp.c $CFLAGS $LDFLAGS -llz4 -static
clang -o myapp myapp.c $CFLAGS $LDFLAGS -llz4 -static

# C++ code
g++ -o myapp myapp.cpp $CXXFLAGS $LDFLAGS -llz4 -static
clang++ -o myapp myapp.cpp $CXXFLAGS $LDFLAGS -llz4 -static

Note

The Environment Modules automatically set CFLAGS, CXXFLAGS, and LDFLAGS when you load the module. Using these variables is the recommended approach as they remain correct even if the module path changes.

Command-line utilities

LZ4 provides command-line utilities for compression and decompression. After loading the lz4 module, these utilities are available in your PATH.

Main compression/decompression tools:

lz4 - Main compression and decompression tool

The primary utility for compressing and decompressing files in the .lz4 format.

  • Compresses files to .lz4 format
  • Decompresses .lz4 files
  • Supports compression levels 1-12 (higher levels trade speed for a better ratio)
  • Supports multi-threading for faster compression (the -T option, available in lz4 1.10.0 and later)

Convenience tools (symlinks to lz4):

Several tools are implemented as symlinks to the main lz4 binary. The lz4 program detects which name it was invoked under (using argv[0]) and adjusts its behavior accordingly. This polymorphism allows one binary to provide multiple interfaces:

lz4cat -> lz4
Decompresses .lz4 files to standard output (equivalent to lz4 -dc). Useful for piping decompressed data to other commands.
unlz4 -> lz4
Decompresses .lz4 files (equivalent to lz4 -d). Provides an intuitive name for decompression operations.

How polymorphism works:

The LZ4 utilities use a common Unix pattern called “name-based polymorphism” or “argv[0] polymorphism”. When a program is invoked, the operating system passes the program name as the first argument (argv[0]). The lz4 binary checks this name to determine its behavior:

  • If invoked as lz4, it compresses/decompresses .lz4 files based on command-line options
  • If invoked as lz4cat, it decompresses to stdout
  • If invoked as unlz4, it forces decompression mode

This design allows:

  • Space efficiency: One binary provides multiple tools
  • Consistency: All tools share the same core implementation and behavior
  • Flexibility: Users can choose the most intuitive name for their task

Because every symlink resolves to the same lz4 binary, lz4cat and unlz4 behave consistently with lz4 itself; only the default mode differs.
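
The dispatch described above can be sketched in a few lines of shell. This is only an illustration of the argv[0] pattern, not the lz4 source code: the function name mode_for_name and the mode strings it prints are hypothetical, chosen to mirror the three cases listed earlier.

```shell
#!/bin/sh
# Hypothetical illustration of argv[0] dispatch; this is NOT how lz4 is implemented.
# mode_for_name maps an invocation name to one of the behaviours described above.
mode_for_name() {
    # Strip any leading directory so /usr/bin/lz4cat and lz4cat are treated alike.
    name=$(basename -- "$1")
    case "$name" in
        lz4cat) echo "decompress-to-stdout" ;;   # comparable to lz4 -dc
        unlz4)  echo "decompress" ;;             # comparable to lz4 -d
        *)      echo "by-options" ;;             # plain lz4: command-line options decide
    esac
}

# In a script, "$0" holds the name the script was invoked under,
# which plays the role of argv[0] in a C program.
mode_for_name "$0"
```

To try it, save the script under any name, make it executable, create a symlink to it called lz4cat in a scratch directory, and run it through the symlink: the script then sees the symlink's name in "$0" and selects the matching branch.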

Example usage:

# Load the module
module load lz4/<version>

# Compress a file
lz4 myfile.txt              # Creates myfile.txt.lz4

# Decompress a file
unlz4 myfile.txt.lz4        # Restores myfile.txt
# or
lz4 -d myfile.txt.lz4       # Restores myfile.txt
# or
lz4cat myfile.txt.lz4       # Decompresses to stdout

# Compress with specific level
lz4 -9 myfile.txt           # High compression (levels go up to -12)

# Compress with multi-threading
lz4 -T4 myfile.txt          # Use 4 threads (lz4 1.10.0 or later; -B4 sets the block size, not threads)

# Pipe decompressed output
lz4cat archive.lz4 | grep pattern

Warning

When processing large files or multiple files, use Slurm batch jobs to execute these utilities on compute nodes rather than login nodes.
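
A batch job for this could look like the sketch below. It is a minimal example under stated assumptions, not a site-approved template: the partition and account are placeholders you must fill in, the resource and time limits are arbitrary, and the *.dat file pattern is hypothetical.

```shell
#!/bin/bash
#SBATCH --job-name=lz4-compress
#SBATCH --partition=<partition>      # placeholder: a partition you have access to
#SBATCH --account=<account>          # placeholder: your project account
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:30:00              # assumption: adjust to your data volume

module purge
module load lz4/<version>

# Work in the directory the job was submitted from.
cd "$SLURM_SUBMIT_DIR" || exit 1

# Compress every .dat file found there (hypothetical file pattern).
for f in *.dat; do
    [ -e "$f" ] || continue          # skip if the pattern matched nothing
    lz4 "$f"                         # writes "$f.lz4" and keeps the original file
done
```

Save the script under a name of your choice and submit it with sbatch from the directory that holds the files. The loop runs serially, so one CPU is requested; request more only if you also enable multi-threaded compression.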

Getting help

For additional assistance: