Building from source

This page documents every build target available in the Makefile and what each produces. For the recommended production build workflow see Best Practices → Build.

Prerequisites

  • A C++14-capable compiler: GCC 7+ or Clang 6+ on Linux; Clang 15+ (Xcode) on macOS.
  • GNU make 3.81+.
  • CMake 3.12+ (required only when USE_MIMALLOC=1, which is the default).
  • autoconf, automake, autoconf-archive, libtool, pkg-config — ext/htslib’s build runs autoreconf -i && ./configure and locates zlib via pkg-config.
  • zlib development headers — htslib links against zlib.
  • OpenMP runtime — libsais uses OpenMP for parallel suffix-array construction. Linux + GCC: libgomp ships with the compiler, nothing extra to install. Linux + Clang: libomp-dev (Debian) / libomp-devel (RHEL). macOS: brew install libomp; the Makefile auto-detects the Homebrew prefix or honours LIBOMP_PREFIX.
  • Git submodules initialised: git submodule update --init --recursive.

See Getting Started → Installation for the full per-platform install commands.
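
A quick preflight check can catch a missing tool before a long build fails partway through. The sketch below is not part of the repository: `missing_tools` is a hypothetical helper that only tests whether each command is on PATH. It does not check versions, and it cannot see header-only or macro-only packages such as autoconf-archive or the zlib development headers.

```shell
#!/bin/sh
# Hypothetical preflight helper (not shipped with bwa-mem3).
# Prints each named command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Typical call, mirroring the prerequisites list. It prints nothing when
# everything is installed. cmake matters only when USE_MIMALLOC=1.
#   missing_tools make cmake autoconf automake libtool pkg-config git
```

An empty result means every listed command was found; any printed name is a package to install before running make.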

Warning — Submodules must be present

The build will fail with a clear error message if any of the required submodules (ext/libsais, ext/htslib, ext/safestringlib, ext/mimalloc, ext/sse2neon) are missing. Always clone with --recursive or run git submodule update --init --recursive before make.
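
To check the submodules before invoking make, something like the following may help. It is a sketch under one assumption: an uninitialised submodule shows up as a missing or empty directory. `missing_submodules` is a hypothetical helper, not a Makefile target, and the Makefile's own check may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Makefile): list the required
# submodule directories that are missing or empty under a repo root.
# $1 = repository root (defaults to the current directory)
missing_submodules() {
    root=${1:-.}
    for sub in ext/libsais ext/htslib ext/safestringlib ext/mimalloc ext/sse2neon; do
        # An uninitialised submodule is normally an empty directory,
        # so test for contents rather than mere existence.
        if [ ! -d "$root/$sub" ] || [ -z "$(ls -A "$root/$sub" 2>/dev/null)" ]; then
            printf '%s\n' "$sub"
        fi
    done
}

# Example: initialise submodules only when something is missing.
#   [ -z "$(missing_submodules)" ] || git submodule update --init --recursive
```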

Standard builds

Default build (host-native)

make

On x86 hosts this is equivalent to make single (see below): one binary containing all five SIMD tiers, with the tier selected in-process at startup. On Apple Silicon and other aarch64 hosts the Makefile detects the architecture and builds a single ARM64 binary with one NEON kernel translation unit (TU).

The resulting binary is bwa-mem3 in the repo root.

Single multi-tier x86 build (default on x86)

make single                       # alias of the default `make`
make BASELINE_ARCH=avx512bw       # raise non-kernel TU compile baseline
make BASELINE_ARCH=sse41          # lower it for pre-Haswell hosts

Builds one bwa-mem3 binary. The four hand-tuned kernel TUs in KERNEL_SRCS (bandedSWA.cpp, kswv.cpp, ksw.cpp, sam_encode.cpp) are compiled five times each — once per supported tier (sse41 / sse42 / avx / avx2 / avx512bw) — and dispatched at runtime via __builtin_cpu_supports. Non-kernel TUs compile once at BASELINE_ARCH (default avx2 since PR #84). See Single-binary SIMD dispatch (x86) for the full design.

Single-tier x86 builds

Pass arch=<target> to compile a single binary with kernels for one tier only (no runtime dispatch table — useful on clusters with uniform hardware):

| Command | SIMD level | ARCH_FLAGS |
|---|---|---|
| make arch=sse41 | SSE4.1 | -msse … -msse4.1 |
| make arch=sse42 | SSE4.2 | -msse … -msse4.2 |
| make arch=avx | AVX | -mavx |
| make arch=avx2 | AVX2 | -mavx2 |
| make arch=avx512bw | AVX-512BW | -mavx512f -mavx512bw -mprefer-vector-width=256 |
| make arch=native | host CPU features | -march=native |

For the Intel compilers (icpc / icpx) the flags differ slightly; see the ifeq ($(CXX), icpc) branches in the Makefile. The avx512bw target keeps the -mprefer-vector-width=256 cap from PR #86; see BASELINE_ARCH=avx512bw build flag for the empirical performance characterization.
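
When choosing a value for arch= on a uniform Linux cluster, the CPU flags reported by the kernel can be mapped onto the tiers in the table above. The sketch below is an approximation for choosing a build target, not the mechanism the binary uses (runtime dispatch goes through __builtin_cpu_supports). `pick_arch` is a hypothetical helper, and the flag names (sse4_1, sse4_2, avx, avx2, avx512bw) are those Linux prints in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical helper: map a space-separated list of Linux CPU flags to
# the highest matching arch= value. Prints an empty line when no tier
# matches (for example on a non-x86 host).
pick_arch() {
    flags=" $1 "
    arch=
    # Pairs are cpuinfo-flag:arch-value, ordered lowest to highest tier,
    # so the last match wins.
    for pair in sse4_1:sse41 sse4_2:sse42 avx:avx avx2:avx2 avx512bw:avx512bw; do
        flag=${pair%%:*}
        case $flags in
            # Surrounding spaces force a whole-word match (avx vs avx2).
            *" $flag "*) arch=${pair##*:} ;;
        esac
    done
    printf '%s\n' "$arch"
}

# Example on Linux (check that the result is non-empty before using it):
#   pick_arch "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

If the build host and the run host are the same machine, make arch=native avoids the lookup entirely.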

ARM64 / Apple Silicon build

make arch=arm64

Compiles a single binary bwa-mem3 with one NEON kernel TU. See Apple Silicon / NEON port for background.

Tuned builds

Profile-Guided Optimization (PGO)

PGO produces the best single-binary performance. The workflow is two-phase:

# Phase 1: instrument binary
make pgo-generate                              # builds bwa-mem3.pgo-instr (arm64 default)
make pgo-generate PGO_ARCH=avx2               # or a specific x86 target

# Run your training workload with the instrumented binary
./bwa-mem3.pgo-instr mem -t 16 ref.fa r1.fq.gz r2.fq.gz > /dev/null

# Phase 2: optimised binary
make pgo-use                                   # builds bwa-mem3.pgo
make pgo-use PGO_ARCH=avx2                     # matching arch

PGO_ARCH accepts the same values as arch=. PGO_PROFILE_DIR defaults to pgo_profiles/ but can be overridden. Output binaries are named bwa-mem3.pgo (default arch) or bwa-mem3.pgo.<arch> when a non-default arch is specified, so multiple arch builds coexist.
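
Because both phases must use the same PGO_ARCH, it can help to generate the command sequence from a single value. The following is a sketch only: `pgo_commands` is a hypothetical helper that prints the commands rather than running them, and it assumes the instrumented binary is named bwa-mem3.pgo-instr for every arch, as in the example above. Verify that name against your build before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the PGO command sequence for one arch so
# that pgo-generate and pgo-use cannot drift apart.
# $1 = PGO_ARCH value (empty string means "use the Makefile default")
# $2 = arguments for the training run of the instrumented binary
pgo_commands() {
    # Expands to " PGO_ARCH=<value>" only when $1 is non-empty.
    arch_arg=${1:+ PGO_ARCH=$1}
    printf 'make pgo-generate%s\n' "$arch_arg"
    # Assumption: the instrumented binary name does not vary with the arch.
    printf './bwa-mem3.pgo-instr %s > /dev/null\n' "$2"
    printf 'make pgo-use%s\n' "$arch_arg"
}

# Example: review the plan first, then pipe it to "sh -e" to execute.
#   pgo_commands avx2 "mem -t 16 ref.fa r1.fq.gz r2.fq.gz"
```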

Clean up instrumented objects and profile data:

make pgo-clean

Link-Time Optimization (LTO)

make lto-build                                 # builds bwa-mem3.lto (native arch)
make lto-build LTO_ARCH=avx2                   # explicit arch

LTO compiles bwa-mem3’s own translation units with -flto (thin LTO on Clang, full LTO on GCC) plus -fno-semantic-interposition on GCC. Third-party libraries (htslib, mimalloc, safestringlib) are linked without LTO. Clean:

make lto-clean

Compute-only profile binary

Used when profiling CPU hotspots without I/O noise. The -DDISABLE_OUTPUT flag short-circuits all BAM/SAM write paths and the file-open / header-emit step, so only alignment work contributes to wall time.

make profile-build                             # builds bwa-mem3.profile (native)
make profile-build PROFILE_ARCH=avx2          # explicit arch
./bwa-mem3.profile mem -t 16 ref.fa r1.fq.gz r2.fq.gz

make profile-clean

Build knobs

| Variable | Default | Effect |
|---|---|---|
| USE_MIMALLOC | 1 | Include mimalloc; set 0 to use the system allocator |
| ASAN | (unset) | Set to any non-empty value to enable AddressSanitizer (forces USE_MIMALLOC=0) |
| COVERAGE | (unset) | Set to enable --coverage + -O0 for gcov line-level coverage |
| EXTRA_CXXFLAGS | (empty) | Appended to CXXFLAGS; forwarded through PGO / LTO targets |
| DISABLE_BATCHED_MATESW | (unset) | Set to 1 to disable the batched mate-rescue SW path on ARM |
| CXX | c++ | Compiler. Paired CC is auto-derived from CXX for libsais. |

Cleaning

make clean

Removes object files, libbwa.a, all binaries, test binaries, libsais objects, safestringlib, htslib, and the mimalloc build tree.

make docs-clean

Removes only the mdbook build output (docs/book/). Covered in Developer Guide → Building context; see the Makefile docs targets for the full list.

Documentation targets

| Target | Action |
|---|---|
| make docs | Build the mdbook into docs/book/ |
| make docs-serve | Live-preview at http://localhost:3000 |
| make docs-cli | Capture --help output for each subcommand into docs/_generated/cli/ |
| make docs-clean | Remove docs/book/ |
| make docs-install-tools | cargo install mdbook + three plugins |

See also: SIMD dispatch architecture · Single-binary SIMD dispatch (x86) · Best Practices → Build · Performance → PGO build · Apple Silicon / NEON port