InfiniCCL is a cross-platform collective communication library that provides a unified interface for distributed deep learning and high-performance computing workloads across heterogeneous hardware architectures.
- C++17-compatible compiler
- CMake ≥ 3.18
- Python ≥ 3.10
  - Earlier versions MAY work but are not guaranteed
- Hardware-specific SDKs for the devices involved (e.g., CUDA Toolkit)
- Backend libraries for the backends involved (e.g., OpenMPI, NCCL)
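The version requirements above can be checked before building. The following is a minimal sketch, not part of InfiniCCL: the helper names `version_ge` and `check_prereqs` are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD versions do).

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumption: `sort -V` (version sort) is available on this system.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: print a message for each tool that is missing or too old.
# (Hypothetical helper; InfiniCCL itself does not ship this function.)
check_prereqs() {
  local cmake_v python_v
  # `cmake --version` prints "cmake version X.Y.Z" on its first line.
  cmake_v="$(cmake --version 2>/dev/null | awk 'NR==1 {print $3}')"
  # `python3 --version` prints "Python X.Y.Z".
  python_v="$(python3 --version 2>/dev/null | awk '{print $2}')"
  version_ge "${cmake_v:-0}" 3.18 || echo "CMake >= 3.18 not found (got: ${cmake_v:-none})"
  version_ge "${python_v:-0}" 3.10 || echo "Python >= 3.10 not found (got: ${python_v:-none})"
}
```

Calling `check_prereqs` prints nothing when both tools meet the stated minimums. It does not check the compiler, SDKs, or backend libraries.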
For multi-node runs, the participating nodes must meet the following requirements:
- Network connectivity between all nodes
- Passwordless SSH access
- Consistent library and binary paths across the cluster (e.g., shared filesystem such as NFS).
  - Note: Using inconsistent file paths across nodes has not yet been fully validated and may cause runtime failures during orchestration.
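The SSH and path requirements above can be checked from the launching node before running anything. The sketch below is an illustration only: `check_node`, `check_cluster`, the `SSH_BIN` override, and the host names in the usage note are all hypothetical. It relies on the standard OpenSSH options `BatchMode` and `ConnectTimeout`.

```shell
# Allow the ssh binary to be overridden (assumption: OpenSSH client by default).
SSH_BIN="${SSH_BIN:-ssh}"

# check_node HOST DIR: succeeds if HOST accepts a non-interactive login
# and DIR exists on it. BatchMode=yes makes ssh fail instead of prompting
# for a password, so a node without key-based access is reported as failed.
check_node() {
  local host="$1" dir="$2"
  "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$host" test -d "$dir"
}

# check_cluster DIR HOST...: print one status line per host and
# return non-zero if any host failed the check.
check_cluster() {
  local dir="$1"
  shift
  local failed=0 host
  for host in "$@"; do
    if check_node "$host" "$dir"; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `check_cluster /nfs/shared/infini node01 node02` (placeholder host names) reports whether each node is reachable without a password and sees the same install directory.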
- Platform Agnostic: NCCL‑like, unified C API for all supported platforms.
- Heterogeneous Communication: Supports communication operations across different platforms and accelerators.
- Platform Auto-Detection: Automatically detects the underlying hardware without manual specification.
- Smart Orchestration: Includes icclrun, a lightweight launcher for automated multi-node building, installation, and execution.
InfiniCCL can be built either manually with CMake or with the provided build.sh script, which performs a single‑command, environment‑aware installation. Currently, InfiniCCL can only be installed from source.
To start, download InfiniCCL using git:
# Clone the repository
git clone https://github.com/InfiniTensor/InfiniCCL.git
cd InfiniCCL
There are two methods to build InfiniCCL.
# Grant execution permission to the script
chmod +x ./scripts/build.sh
# Build and install to default location (~/.infini)
./scripts/build.sh
To specify a custom install path (e.g., NFS shared directory):
INSTALL_PREFIX=/nfs/shared/infini ./scripts/build.sh
To set any CMake flag or option, simply add it after the script (see the end of this section for all the available options):
# For instance, explicitly turn on `NCCL`
./scripts/build.sh -DWITH_NCCL=ON
After the script completes, you will see output similar to:
========================================================
Starting InfiniCCL Build
Build Directory: build
Install Prefix: /nfs/shared/infini
========================================================
-- No backend specified. Defaulting to WITH_OMPI=ON
-- Auto-detecting available devices...
-- Auto-detected NVIDIA environment.
-- InfiniCCL Config: Devices [cpu, nvidia] | Backends [ompi]
...
-- Installing: /nfs/shared/infini/lib/libinfiniccl.so
...
========================================================
Build and Install Finished!
Binaries: /nfs/shared/infini/bin
Libraries: /nfs/shared/infini/lib
Headers: /nfs/shared/infini/include
========================================================
NOTE: If this is your first install, run: source ~/.bashrc
You can now run 'icclrun' from anywhere.
Note: Running source ~/.bashrc (or restarting your terminal) is required only once to add icclrun to your PATH.
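A quick way to confirm the installation is to look for the installed artifacts under the install prefix. This is a minimal sketch under stated assumptions: `verify_install` is a hypothetical helper, the location of `icclrun` under `bin/` is inferred from the build output above, and the header path is inferred from the `#include <infiniccl/infiniccl.h>` directive used by client programs.

```shell
# verify_install PREFIX: report whether the expected artifacts exist under PREFIX.
# Returns non-zero if any of them is missing.
verify_install() {
  local prefix="$1" status=0 path
  for path in \
    "$prefix/bin/icclrun" \
    "$prefix/lib/libinfiniccl.so" \
    "$prefix/include/infiniccl/infiniccl.h"; do
    if [ -e "$path" ]; then
      echo "found   $path"
    else
      echo "missing $path"
      status=1
    fi
  done
  return "$status"
}
```

Typical usage would be `verify_install "${INSTALL_PREFIX:-$HOME/.infini}"`, which falls back to the default install location when no custom prefix was used.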
InfiniCCL can be built and installed manually.
# Build and Install InfiniCCL
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/install [options]
cmake --build . -j $(nproc)
cmake --install .
# Configure the Environment Variables
# It is recommended to add these to your `~/.bashrc` for persistence.
export INFINICCL_ROOT="/path/to/InfiniCCL"
export PATH="/path/to/install/bin:$PATH"
export LD_LIBRARY_PATH="/path/to/install/lib:$LD_LIBRARY_PATH"
Example configuration (NVIDIA + OpenMPI):
cmake .. -DWITH_NVIDIA=ON -DWITH_OMPI=ON
📋 Full list of CMake options
| Option | Description | Default |
|---|---|---|
| Device (Hardware) Options | | |
| `WITH_NVIDIA` | Enable NVIDIA GPU support | OFF |
| `WITH_METAX` | Enable MetaX GPU support | OFF |
| `WITH_CPU` | CPU support (always enabled) | ON (internal, not user‑settable) |
| Backend (Communication) Options | | |
| `WITH_OMPI` | Enable OpenMPI backend | ON if no backend specified, otherwise OFF |
| `WITH_NCCL` | Enable NCCL backend | OFF |
| Miscellaneous | | |
| `AUTO_DETECT_DEVICES` | Automatically detect available devices and enable corresponding support | ON |
| `BUILD_EXAMPLES` | Build internal example programs | ON |
These options are available in any CMake project and can be passed during configuration:
| Option | Description | Typical Default |
|---|---|---|
| `CMAKE_INSTALL_PREFIX` | Installation directory | /usr/local (set via -DCMAKE_INSTALL_PREFIX=... or environment) |
| `CMAKE_BUILD_TYPE` | Build type (Release, Debug, RelWithDebInfo, MinSizeRel) | (empty, but build.sh sets Release) |
Notes:
- `AUTO_DETECT_DEVICES` overrides device options like `WITH_NVIDIA` when detection succeeds.
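To illustrate how these options combine, here is a sketch of a fully explicit configuration passed through build.sh. It uses only the options listed in the tables above. Treat the following as assumptions: that disabling `AUTO_DETECT_DEVICES` makes the explicit device options authoritative (inferred from the note above), and that this particular device/backend combination suits your cluster.

```shell
# Explicit configuration sketch (assumption: run from the repository root).
# - AUTO_DETECT_DEVICES=OFF: rely on the device options given here
#   rather than on detection.
# - WITH_OMPI is requested explicitly because, per the table, it only
#   defaults to ON when no backend is specified.
INSTALL_PREFIX=/nfs/shared/infini ./scripts/build.sh \
  -DAUTO_DETECT_DEVICES=OFF \
  -DWITH_NVIDIA=ON \
  -DWITH_OMPI=ON \
  -DWITH_NCCL=ON
```

The configuration line printed during the build (`InfiniCCL Config: Devices [...] | Backends [...]`) is the place to confirm which devices and backends were actually enabled.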
To compile and execute a program that uses InfiniCCL, use icclrun from a single node; it compiles, launches, and executes the program across the cluster.
icclrun reads its configuration from a cluster.yaml file, which you must fill in first. examples/cluster.yaml is an example template.
Once the build has succeeded and cluster.yaml is complete, you can compile and execute a distributed program across the cluster.
To run an internal example program (e.g., examples/all_reduce.cc), just run:
# Build and run the executable across the cluster based on the config specified in your `cluster.yaml`
icclrun --config [path-to-your-cluster.yaml] --build all_reduce [program args]
- --config / -c: Path to the cluster YAML file.
- --build: Instructs icclrun to compile the library on each node before execution. If omitted, icclrun assumes the library is already installed at a consistent location.
The executable (e.g., all_reduce) and its arguments follow the options.
For more details, run:
icclrun --help
If everything is correctly set up, the example program should have just been launched and executed across the cluster you specified in cluster.yaml!
To run and log multiple example programs consecutively, use the provided automated batch verification script scripts/run_examples.py. This tool runs multiple validation profiles, monitors exit codes, captures chronological trace logs, and optimizes away redundant compilation passes.
Before using the script for the first time, grant it execution permission:
# Grant execution permission to the script
chmod +x ./scripts/run_examples.sh
For all the options of the script, see:
./run_examples --help
InfiniCCL does not yet provide a CMake config package (e.g., find_package(InfiniCCL)). Until then, the recommended way to link your own programs is to:
- Set an environment variable INFINICCL_INSTALL pointing to the InfiniCCL installation directory (e.g., ~/.infini or a shared NFS path), and/or add the installed library path to LD_LIBRARY_PATH.
- Link directly to the library, headers, and required dependencies (MPI, GPU runtime).
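The first step might look like the following in a shell session or in `~/.bashrc`. This is a sketch that assumes the default install location (`~/.infini`); substitute your own prefix, such as a shared NFS path.

```shell
# Assumption: InfiniCCL was installed to the default prefix used by build.sh.
export INFINICCL_INSTALL="$HOME/.infini"

# Prepend the library directory. The ${VAR:+...} form adds the ':' separator
# only when LD_LIBRARY_PATH already has a value, avoiding a trailing ':'.
export LD_LIBRARY_PATH="$INFINICCL_INSTALL/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

A trailing or empty entry in `LD_LIBRARY_PATH` is treated as the current directory by the dynamic loader, which is why the conditional separator is used here.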
Within the program, just include:
#include <infiniccl/infiniccl.h>
A minimal CMakeLists.txt for such a program:
cmake_minimum_required(VERSION 3.18)
project(UserApp LANGUAGES CXX)
# Path to InfiniCCL installation (set via environment)
set(INFINI_INSTALL_DIR $ENV{INFINICCL_INSTALL})
# Your application
add_executable(app main.cc)
# Direct linking to InfiniCCL
target_include_directories(app PRIVATE "${INFINI_INSTALL_DIR}/include")
target_link_libraries(app
    PRIVATE
        "${INFINI_INSTALL_DIR}/lib/libinfiniccl.so"
        # other necessary libraries
        # ...
)
# RPATH to find `libinfiniccl.so` at runtime
set_target_properties(app PROPERTIES
    BUILD_WITH_INSTALL_RPATH TRUE
    INSTALL_RPATH "${INFINI_INSTALL_DIR}/lib"
)
Then build and run with icclrun, as with the internal examples:
icclrun --config [path-to-your-cluster.yaml] --build app [program args]
If you are not using CMake, you can compile and link the program manually, as with any other dependency, for instance with these options:
-I${INFINICCL_INSTALL}/include \
-L${INFINICCL_INSTALL}/lib \
-Wl,-rpath,${INFINICCL_INSTALL}/lib \
-linfiniccl
If you omit -Wl,-rpath, add the library directory to LD_LIBRARY_PATH instead:
export LD_LIBRARY_PATH=${INFINICCL_INSTALL}/lib:$LD_LIBRARY_PATH
📋 Supported runtime device platforms
| Platform | Support Level | Notes |
|---|---|---|
| CPU | Partial | Runtime available, but no pure CPU collective operations yet. Planned for future releases. |
| NVIDIA | Full | Requires CUDA Toolkit. |
| MetaX | Full | Requires MACA SDK and MACA_PATH (default /opt/maca) to be set. |
📋 Supported communication backends
| Backend | Support Level | Required CMake Option | Dependencies |
|---|---|---|---|
| OpenMPI | Full | WITH_OMPI=ON | The default backend. Requires the OpenMPI development package. |
InfiniCCL is distributed under the terms of the LICENSE file in the root of the repository.
Contributions are welcome! Please refer to the contribution guidelines in CONTRIBUTING.md.