mem
bwa-mem3 mem aligns short DNA reads against an indexed reference genome
using the BWA-MEM algorithm. It accepts one or two FASTQ files (single-end or
paired-end) and writes alignments to stdout in SAM or BAM format. It is the
primary alignment subcommand; nearly all bwa-mem3 usage flows through it.
Synopsis
Usage: bwa-mem3 mem [options] <idxbase> <in1.fq> [in2.fq]
Options:
Algorithm options:
-o STR Output SAM file name
--bam[=N] Emit BAM instead of SAM text. N=0 (default) = uncompressed;
1..9 = BGZF deflate levels. Writes to stdout; redirect with `>`.
-t INT number of threads [1]
-k INT minimum seed length [19]
-w INT band width for banded alignment [100]
-d INT off-diagonal X-dropoff [100]
-r FLOAT look for internal seeds inside a seed longer than {-k} * FLOAT [1.5]
-y INT seed occurrence for the 3rd round seeding [20]
-c INT skip seeds with more than INT occurrences [500]
-D FLOAT drop chains shorter than FLOAT fraction of the longest overlapping chain [0.50]
-W INT discard a chain if seeded bases shorter than INT [0]
-m INT perform at most INT rounds of mate rescues for each read [50]
-S skip mate rescue
-P skip pairing; mate rescue performed unless -S also in use
Scoring options:
-A INT score for a sequence match, which scales options -TdBOELU unless overridden [1]
-B INT penalty for a mismatch [4]
-O INT[,INT] gap open penalties for deletions and insertions [6,6]
-E INT[,INT] gap extension penalty; a gap of size k cost '{-O} + {-E}*k' [1,1]
-L INT[,INT] penalty for 5'- and 3'-end clipping [5,5]
-U INT penalty for an unpaired read pair [17]
Input/output options:
-p smart pairing (ignoring in2.fq)
-R STR read group header line such as '@RG\tID:foo\tSM:bar' [null]
-H STR/FILE insert STR to header if it starts with @; or insert lines in FILE [null]
-j treat ALT contigs as part of the primary assembly (i.e. ignore <idxbase>.alt file)
-5 for split alignment, take the alignment with the smallest coordinate as primary
-q don't modify mapQ of supplementary alignments
-K INT process INT input bases in each batch regardless of nThreads (for reproducibility) []
-v INT verbose level: 1=error, 2=warning, 3=message, 4+=debugging [3]
-T INT minimum score to output [30]
-h INT[,INT] if there are <INT hits with score >80.00% of the max score, output all in XA [5,200]
-z FLOAT the fraction of the max score to use with -h [0.80]
-u output XB instead of XA; XB is XA with the alignment score and mapping quality added
-a output all alignments for SE or unpaired PE
-C append FASTA/FASTQ comment to SAM output
-V output the reference FASTA header in the XR tag
-Y use soft clipping for supplementary alignments
-M mark shorter split hits as secondary
-I FLOAT[,FLOAT[,INT[,INT]]]
specify the mean, standard deviation (10% of the mean if absent), max
(4 sigma from the mean if absent) and min of the insert size distribution.
FR orientation only. [inferred]
Bisulfite (--meth) options:
--meth enable inline bwameth-style C→T/G→A read conversion + meth-aware BAM
emission. Implies --bam. Requires the reference to have been built
with `bwa-mem3 index --meth` (emits ref.fa.bwameth.c2t).
--set-as-failed f|r
flag alignments to the matching strand ('f' or 'r') as QC-fail (0x200)
--do-not-penalize-chimeras
disable the longest-match <44% chimera heuristic (no 0x200 / MAPQ cap)
Supplementary MAPQ rescoring (fg-labs extension):
--supp-rep-hard-cap INT
force MAPQ=0 for supplementary alignments whose chain contains any seed
with >=INT genome occurrences (i.e. the supp region is repetitive on its
own). 0 disables (default). Typical values 5-20; lower = more aggressive.
Primary MAPQ is unaffected.
Help:
--help print this help message and exit
Note: Please read the man page for detailed description of the command line and options.
Common usage
Paired-end alignment, 16 threads, SAM to stdout:
bwa-mem3 mem -t 16 ref.fa R1.fq.gz R2.fq.gz > out.sam
Paired-end alignment, emit uncompressed BAM, pipe directly to samtools sort:
bwa-mem3 mem --bam -t 16 ref.fa R1.fq.gz R2.fq.gz \
| samtools sort -@ 8 -o out.bam -
samtools index out.bam
Paired-end methylation alignment with a read group header:
bwa-mem3 mem --meth -t 16 \
-R '@RG\tID:lib1\tSM:sample1\tPL:ILLUMINA' \
ref.fa R1.fq.gz R2.fq.gz \
| samtools sort -o out.bam -
Flag reference
Input / output
-o STR — output file
Write output to STR instead of stdout. Honored for both SAM and --bam
output; the path is opened lazily so BAM mode can hand it to htslib instead of
truncating it as a SAM-text file. Stdout redirection (>) remains an
alternative.
--bam[=N] — emit BAM
Emit BAM instead of SAM. N controls BGZF compression: 0 (default when
--bam is used without =) writes uncompressed BAM, which costs almost no
CPU and is the recommended mode for piping to samtools sort. Values 1–9
select increasing BGZF deflate levels; use --bam=6 or --bam=9 only when
writing directly to final storage without a downstream sort step.
Tip: Prefer --bam for production pipelines
Uncompressed BAM (--bam or --bam=0) eliminates the text-formatting cost on the aligner side and the text-parse cost on the samtools sort side. For any pipeline that immediately sorts or processes the output, this is faster than SAM and carries the same alignment information.
-R STR — read group header
Injects a @RG header line and tags every alignment with RG:Z:<ID>. The
value is a tab-separated @RG line with literal \t escapes, for example:
-R '@RG\tID:run1\tSM:HG001\tPL:ILLUMINA\tLB:lib1'
bwa-mem3 escapes any literal tab characters inside -R values before writing
them to the @PG CL: field, preventing header corruption (fix for issue #45).
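When the read group is assembled in a script, it is easy to emit real tab characters by accident. The following is a minimal sketch, not part of bwa-mem3: the helper name make_rg and the sample values are hypothetical. It builds the value with literal \t escapes (a backslash followed by t), as described above.

```shell
# Hypothetical helper: build an @RG line with literal "\t" separators
# (backslash + t, not real tab characters), as -R expects.
make_rg() {
    # $1 = ID, $2 = SM, $3 = PL, $4 = LB
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s\\tLB:%s' "$1" "$2" "$3" "$4"
}

rg=$(make_rg run1 HG001 ILLUMINA lib1)
printf '%s\n' "$rg"   # prints @RG\tID:run1\tSM:HG001\tPL:ILLUMINA\tLB:lib1
```

The result can then be passed as -R "$rg"; the double quotes keep the value as a single argument.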
-H STR/FILE — extra header lines
If STR begins with @, it is injected verbatim as a header line. Otherwise
STR is treated as a path and every line in the file is injected. Useful for
adding @CO comments or custom @RG / @PG entries.
-p — smart pairing
Reads interleaved paired-end data from a single FASTQ file (in1.fq) rather
than two separate files. The second positional argument (in2.fq) is ignored.
-5 — leftmost-coordinate primary
For split alignments, designates the alignment with the smallest genomic coordinate as primary, rather than the longest alignment. Useful for some downstream tools that expect the leftmost alignment to be primary.
-q — preserve supplementary MAPQ
By default, bwa-mem3 may downgrade the MAPQ of supplementary alignments.
-q suppresses that adjustment.
-K INT — fixed batch size
Processes INT input bases in each batch regardless of the number of
threads. Without -K the batch size follows the thread count, so runs with
different -t values can produce different output; fixing -K to the same
value in every run makes the alignments reproducible across thread counts.
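One way to check this in practice is to compare the SAM output of two runs made with the same -K but different -t. The sketch below is not part of bwa-mem3; the helper names are hypothetical. It assumes the @PG header line records the command line (and therefore differs between the runs), so it ignores @PG lines and compares everything else.

```shell
# Hypothetical helper: checksum a SAM file while ignoring @PG header lines,
# whose CL: field is assumed to record the command line (so it differs
# whenever -t differs).
sam_digest() {
    grep -v '^@PG' "$1" | cksum
}

# Succeeds when the two SAM files are identical outside their @PG lines.
same_alignments() {
    [ "$(sam_digest "$1")" = "$(sam_digest "$2")" ]
}
```

For example, with two hypothetical outputs t4.sam and t16.sam, `same_alignments t4.sam t16.sam && echo reproducible` reports whether the records match.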
-v INT — verbosity
Controls stderr diagnostic output: 1 = errors only, 2 = warnings,
3 = informational messages (default), 4+ = debugging.
-a — all alignments
Outputs all alignments found for single-end reads and for paired-end reads that could not be paired. The additional alignments are reported as secondary alignments.
-C — append FASTA/FASTQ comment
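Appends the comment field from the FASTA/FASTQ header to the end of each SAM record. Useful when the comment carries barcodes or UMIs. As in upstream bwa, the comment is copied as-is, so it should already be formatted as SAM tags (for example BC:Z:CGTAC); otherwise the output is not valid SAM.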
-V — reference header in XR tag
Emits the reference FASTA header line for each alignment position as an XR
SAM tag.
-Y — soft-clip supplementary alignments
Uses soft clipping instead of hard clipping for supplementary alignments. Some downstream tools require this.
-M — mark shorter split hits as secondary
Marks the shorter alignment in a split read as secondary (sets the 0x100 flag)
rather than supplementary. Needed for compatibility with tools that do not
handle supplementary alignments (e.g. older versions of Picard's
duplicate marking).
-j — treat ALT contigs as primary
Treats ALT contigs as part of the primary assembly by ignoring the
<idxbase>.alt file. Use when your workflow does not include ALT-aware
postprocessing.
Scoring
All scoring flags accept integer values. Changing -A (match score) scales
the penalty flags that default to multiples of -A; explicit overrides of
individual flags are unaffected.
| Flag | Default | Meaning |
|---|---|---|
| -A INT | 1 | Score for a sequence match. Scales -T, -d, -B, -O, -E, -L, -U unless overridden. |
| -B INT | 4 | Mismatch penalty. |
| -O INT[,INT] | 6,6 | Gap open penalty for deletions and insertions respectively. |
| -E INT[,INT] | 1,1 | Gap extension penalty per base. A gap of length k costs -O + -E * k. |
| -L INT[,INT] | 5,5 | Clipping penalty for 5’ and 3’ ends. |
| -U INT | 17 | Penalty for an unpaired read pair (affects mate-rescue scoring). |
| -T INT | 30 | Minimum alignment score to output. Alignments below this threshold are not reported. |
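The gap cost formula and the -A scaling rule in the table can be worked through with a little arithmetic. This is a sketch of the arithmetic only, using the documented defaults; the function names are hypothetical and nothing here calls the aligner.

```shell
# Gap cost from the table above: open penalty + extension penalty * length.
gap_cost() {
    # $1 = -O, $2 = -E, $3 = gap length k
    echo $(( $1 + $2 * $3 ))
}

# A default penalty scaled by the match score -A. Per the table, this
# applies only to flags that were not set explicitly.
scaled() {
    # $1 = -A, $2 = documented default of the penalty
    echo $(( $1 * $2 ))
}

gap_cost 6 1 3                                # defaults, 3 bp gap: prints 9
gap_cost "$(scaled 2 6)" "$(scaled 2 1)" 3    # with -A 2: 12 + 2*3, prints 18
```

With -A 2 and no other scoring flag given, the same 3 bp gap costs twice as much, which keeps its weight relative to a match unchanged.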
Note: --meth overrides scoring defaults
When --meth is active, bwa-mem3 applies bwameth.py-compatible defaults: -B 2 -L 10 -U 100 -T 40 -CM. Any of these can still be overridden by passing the flag explicitly after --meth.
Paired-end
-I FLOAT[,FLOAT[,INT[,INT]]] — insert size distribution
Specifies the mean, standard deviation (default: 10% of mean), maximum (default: 4 sigma above mean), and minimum of the insert size distribution for FR-orientation paired-end reads. By default bwa-mem3 infers these parameters from the first batch of reads. Provide them explicitly for speed or when the reference is short and inference may be inaccurate.
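To see what the documented defaults expand to for a given mean, the sketch below computes the standard deviation and maximum from the rules stated above. The helper name is hypothetical; the minimum is omitted because no default for it is given here.

```shell
# Hypothetical helper: expand a bare mean into the defaults described above
# (std = 10% of the mean, max = mean + 4 * std). The minimum is left out.
insert_size_defaults() {
    awk -v mean="$1" 'BEGIN {
        std = 0.10 * mean
        max = int(mean + 4 * std + 0.5)   # round to the nearest integer
        printf "%.1f,%.1f,%d\n", mean, std, max
    }'
}

insert_size_defaults 400   # prints 400.0,40.0,560
```

Passing -I 400 alone would therefore behave like -I 400,40,560 under these rules.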
-m INT — mate rescue rounds
Maximum number of mate-rescue attempts per read. Reduce to speed up alignment on data where the default (50) wastes time on unrescuable pairs.
-S — skip mate rescue
Disables mate rescue entirely. Faster but may reduce sensitivity for discordant pairs.
-P — skip pairing
Skips the pairing step; mate rescue still runs unless -S is also given.
Filtering
-c INT — skip repetitive seeds
Seeds with more than INT occurrences in the reference are skipped. Lowering
this (e.g. to 50) speeds up alignment of highly repetitive reads but may
reduce sensitivity. Raising it increases sensitivity in repeat-heavy regions
at a cost in runtime.
-D FLOAT — chain length fraction
Drops chains shorter than FLOAT times the longest overlapping chain. The
default (0.50) discards chains that are less than half the length of the best
chain.
-W INT — minimum seeded bases
Discards chains with fewer than INT seeded bases. Raising this filters out
very short, low-confidence chains.
-h INT[,INT] — secondary alignment reporting
If there are fewer than INT hits with score exceeding FLOAT (see -z)
times the maximum score, all of them are output in the XA auxiliary tag.
The second integer is a hard cap on the number of XA entries. Defaults: 5, 200.
-z FLOAT — secondary score fraction
Fraction of the maximum alignment score used as the threshold for secondary
hit reporting with -h. Default: 0.80.
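The interaction between -h and -z can be sketched as a small decision function. This follows the rule as worded above and is not the aligner's code. The function name is hypothetical, and it assumes the scores passed in are those of the candidate secondary hits, with the best alignment's score given separately.

```shell
# Sketch of the -h / -z rule as described above (not the aligner's code):
# count hits scoring above z * best and compare the count with the -h limit.
xa_decision() (
    # $1 = first -h value, $2 = -z fraction, $3 = best score,
    # remaining arguments = scores of the candidate secondary hits
    limit=$1 frac=$2 best=$3
    shift 3
    printf '%s\n' "$@" |
        awk -v limit="$limit" -v frac="$frac" -v best="$best" '
            $1 > frac * best { n++ }
            END {
                result = (n + 0 < limit + 0) ? "XA" : "omitted"
                print result
            }'
)

xa_decision 5 0.80 60 55 50 40            # 2 hits above 48: prints XA
xa_decision 5 0.80 60 59 59 59 59 59 59   # 6 hits above 48: prints omitted
```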
-u — emit XB instead of XA
Outputs XB in place of XA. XB is an extension of XA that also carries
the alignment score and mapping quality for each secondary hit.
Methylation (--meth)
--meth — enable bisulfite alignment mode
Activates inline C→T (R1) and G→A (R2) read conversion, bwameth-compatible
scoring defaults, inline BAM post-processing, and forces --bam output.
The reference must have been indexed with bwa-mem3 index --meth.
Pass the original FASTA prefix as <idxbase> — the .bwameth.c2t suffix is
appended automatically. If <idxbase> already ends in .bwameth.c2t
(interop with an external c2t converter), the auto-append is skipped.
See Methylation Reference for the full treatment.
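The prefix rule above can be mirrored in a wrapper script to check for the converted reference before starting a long run. This is a minimal sketch with hypothetical helper names. It assumes only what is stated in this page: that bwa-mem3 index --meth emits a file named after the FASTA with a .bwameth.c2t suffix.

```shell
# Sketch of the prefix rule described above: append .bwameth.c2t unless the
# given <idxbase> already ends with it.
meth_idxbase() {
    case "$1" in
        *.bwameth.c2t) printf '%s\n' "$1" ;;
        *)             printf '%s\n' "$1.bwameth.c2t" ;;
    esac
}

# Hypothetical preflight: succeed only if the converted reference exists.
has_meth_index() {
    [ -e "$(meth_idxbase "$1")" ]
}
```

A pipeline could call `has_meth_index ref.fa || exit 1` ahead of the alignment step so that a missing --meth index fails early instead of producing bad alignments.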
--set-as-failed {f|r} — strand QC-fail flag
Forces the QC-fail bit (0x200) on all alignments to the forward (f) or
reverse (r) bisulfite strand. Used when one strand is known to be
unreliable for a given library preparation.
--do-not-penalize-chimeras — disable chimera heuristic
Disables the longest-match < 44% chimera heuristic that would otherwise set
0x200, clear 0x2, and cap MAPQ at 1 for likely chimeric alignments.
Use when the default chimera filter is too aggressive for your library type.
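The effect of the heuristic on a single record can be illustrated with bit arithmetic on the SAM flag. This is a sketch of the documented outcome, not the aligner's code. The function name is hypothetical, and it assumes the 44% threshold is measured against the read length.

```shell
# Sketch of the flag changes described above (not the aligner's code).
# Assumption: the 44% threshold is relative to the read length.
chimera_adjust() (
    # $1 = SAM flag, $2 = MAPQ, $3 = longest match length, $4 = read length
    flag=$1 mapq=$2 longest=$3 readlen=$4
    # Integer form of: longest / readlen < 0.44
    if [ $(( longest * 100 )) -lt $(( readlen * 44 )) ]; then
        flag=$(( (flag | 0x200) & ~0x2 ))   # set QC-fail, clear proper-pair
        [ "$mapq" -gt 1 ] && mapq=1         # cap MAPQ at 1
    fi
    echo "$flag $mapq"
)

chimera_adjust 99 60 40 150    # 40/150 is below 44%: prints 609 1
chimera_adjust 99 60 100 150   # 100/150 is above 44%: prints 99 60
```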
Threading
-t INT — number of threads
Number of worker threads. Defaults to 1. Set to the number of physical cores available to this job. Scaling is workload- and hardware-dependent: on typical machines throughput flattens around 16–32 threads, where FM-index memory bandwidth and I/O contention dominate; on high-memory, fast-I/O servers the aligner can keep scaling toward ~64 threads on hg38 before saturating.
See User Guide — Threading and resource use for guidance on thread counts at various machine sizes.
Supplementary MAPQ rescoring
--supp-rep-hard-cap INT — cap MAPQ for repetitive supplementary alignments
Forces MAPQ=0 for supplementary alignments whose chain contains any seed with
at least INT occurrences in the genome. This targets supplementary
alignments anchored in repetitive regions that upstream MAPQ scoring may
overestimate. 0 disables the cap (default). Typical values are 5–20; lower
values are more aggressive. Primary alignment MAPQ is unaffected.
Seeding and extension
-k INT — minimum seed length
Minimum exact-match seed length. Shorter seeds increase sensitivity but raise runtime. The default (19) is calibrated for 100–150 bp Illumina reads.
-w INT — band width
Band width for the banded Smith-Waterman extension. Wider bands can recover alignments with long indels at greater CPU cost.
-d INT — X-dropoff
Off-diagonal X-dropoff for the Z-drop heuristic. Controls how far an alignment extension continues after a score drop.
-r FLOAT — re-seeding factor
Seeds longer than -k * FLOAT are re-seeded internally to find sub-seeds.
Lowering this produces more seeds and higher sensitivity at greater cost.
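The re-seeding cutoff is just the product of the two flags. The sketch below works it out for given values; the function name is hypothetical.

```shell
# Re-seeding cutoff as described above: seeds longer than -k * -r are
# re-seeded.
reseed_threshold() {
    # $1 = -k, $2 = -r
    awk -v k="$1" -v r="$2" 'BEGIN { print k * r }'
}

reseed_threshold 19 1.5   # defaults: prints 28.5
```

With the defaults, seeds of 29 bp or longer are re-seeded; lowering -r to 1.0 would move the cutoff down to 19.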
-y INT — third-round seed occurrence threshold
Seed occurrence threshold for the third round of seeding. Rarely needs adjustment outside highly repetitive genomes.
Notes / Gotchas
Warning: --meth requires a --meth index
Running bwa-mem3 mem --meth against a standard (non-c2t) index produces incorrect alignments without an error. Confirm that the index was built with bwa-mem3 index --meth before aligning bisulfite data.
Note: SIMD variant printed to stderr at startup
When mem starts it prints a banner (Executing in AVX512 mode!! etc.) to stderr. This is informational and does not affect stdout output.
See also: User Guide — Aligning short reads · User Guide — Output: SAM/BAM, headers, tags · CLI Reference — index · Methylation Reference — Overview · Best Practices — Output format