LMDB (Lightning Memory-Mapped Database)
LMDB is an embedded, memory-mapped, B+tree key-value store written in C by Howard Chu for the OpenLDAP project. It is among the fastest embedded stores for read-heavy workloads: reads come straight out of the OS page cache via mmap, with no copy into an application buffer, and by default each write transaction is flushed to disk when it commits. LMDB underpins OpenLDAP’s back-mdb backend, the Monero blockchain database, several Python ML data loaders, and many other systems where read latency matters more than feature richness.
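The idea of reading through a memory map can be sketched with Python's standard mmap module. This is only an illustration of the mechanism, not LMDB's API or on-disk format: the fixed 8-byte record layout, RECORD_SIZE, write_records, and read_record are all hypothetical names invented for the example.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 8  # hypothetical fixed record width, for illustration only


def write_records(path, count):
    """Write `count` fixed-width records such as b"rec00002"."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(b"rec%05d" % i)  # 3 letters + 5 digits = 8 bytes


def read_record(path, index):
    """Return record `index` by slicing a read-only memory map."""
    start = index * RECORD_SIZE
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            # The memoryview slice refers to the mapped pages directly;
            # bytes() makes the only copy, so the value outlives the map.
            with memoryview(mm) as view, view[start:start + RECORD_SIZE] as rec:
                return bytes(rec)
        finally:
            mm.close()


path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, 4)
print(read_record(path, 2))  # b'rec00002'
```

Locating the record is plain offset arithmetic on mapped memory; the operating system pages data in on first touch and serves later accesses from the page cache.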
Key Features:
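- Memory-Mapped. The database file is mapped into the process address space, up to a configured map size. Reads return pointers into the map: no read() syscalls and no copies, although touching a page that is not yet resident incurs a page fault.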
- B+tree Storage. Sorted keys, range scans by traversal, predictable read amplification.
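- Single-Writer MVCC. One write transaction at a time; many concurrent readers (bounded by the configurable reader table, 126 slots by default). Readers never block the writer and the writer never blocks readers. Each reader sees a consistent snapshot.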
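- Crash-Safe. Copy-on-write page updates with no write-ahead log: live pages are never overwritten in place, so a crash or power loss leaves the database at the last durably committed state. LMDB is not append-only; pages freed by obsolete versions are tracked and reused, so the file does not grow without bound.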
- Tiny Code Base. Roughly 10K lines of C, small enough to be reviewed in full.
- Sub-Microsecond Reads. Cache-warm point reads can complete in under 1 µs; often substantially faster than LevelDB / RocksDB on read-mostly workloads, with the margin depending on record size and working set.
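The single-writer, snapshot-reader behaviour described above can be sketched as a toy in-memory store. This is not LMDB's API: TinyCowStore and its methods are hypothetical, and the sketch copies the whole dictionary on each write, whereas a copy-on-write B+tree copies only the pages along the modified path.

```python
import threading


class TinyCowStore:
    """Toy copy-on-write store: one writer at a time, lock-free snapshots."""

    def __init__(self):
        self._root = {}  # current committed version; never mutated in place
        self._write_lock = threading.Lock()  # serializes writers

    def snapshot(self):
        # Readers grab the current root without taking the write lock.
        return self._root

    def write(self, updates):
        with self._write_lock:
            new_root = dict(self._root)  # copy instead of modifying in place
            new_root.update(updates)
            # Publishing the new root is a single reference swap, loosely
            # analogous to switching to a new root page at commit time.
            self._root = new_root


store = TinyCowStore()
store.write({b"a": b"1"})
snap = store.snapshot()            # reader pins the current version
store.write({b"a": b"2", b"b": b"3"})
print(snap[b"a"], store.snapshot()[b"a"])  # b'1' b'2'
```

Because a committed version is never modified, a reader holding an old snapshot keeps seeing exactly the data that existed when it started, and an interrupted write leaves the previously published version untouched.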
LMDB vs. LevelDB / RocksDB:
- LMDB. B+tree, mmap-based, read-optimized. Database size is bounded by the configured map size, which must fit in the address space (effectively unlimited on 64-bit). Single-writer.
- LevelDB / RocksDB. LSM-tree, write-optimized. Higher write throughput, more tuning surface, more features.
- Choose LMDB for read-heavy embedded workloads; choose LevelDB / RocksDB for write-heavy or feature-rich needs.
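The difference in read paths can be sketched with a toy model. It assumes a drastically simplified LSM tree made of a few sorted runs searched newest first, with no Bloom filters, levels, or compaction; sorted_lookup and lsm_lookup are hypothetical helpers, not part of any of these libraries.

```python
import bisect


def sorted_lookup(keys, values, key):
    """Binary search in one sorted structure (a stand-in for one tree descent)."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None


def lsm_lookup(runs, key):
    """Probe sorted runs newest first; return (value, runs_probed)."""
    probed = 0
    for keys, values in runs:
        probed += 1
        value = sorted_lookup(keys, values, key)
        if value is not None:
            return value, probed
    return None, probed


# Newest run first: b"c" was overwritten after the older run was written.
runs = [
    ([b"c"], [b"new"]),
    ([b"a", b"c"], [b"1", b"old"]),
]
print(lsm_lookup(runs, b"c"))  # (b'new', 1)
print(lsm_lookup(runs, b"a"))  # (b'1', 2)
```

In this model a lookup in a single sorted structure always costs one search, while the LSM-style lookup may touch several runs before it finds the key or gives up. Real LSM engines reduce that cost with filters and compaction, which is part of their larger tuning surface.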
Use Cases:
- OpenLDAP back-mdb: the original use case; it replaced the Berkeley DB-based backends.
- Monero blockchain database: predictable, high-throughput reads. (Bitcoin Core, by contrast, stores its block index and chainstate in LevelDB.)
- ML training data loaders: pre-encoded samples stored in LMDB are read directly into PyTorch / TensorFlow batches.
- Embedded indexes inside larger applications where mmap reads are dominant.
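For the data-loader case, a common concern is how to encode keys and values as bytes. By default LMDB compares keys as byte strings, so integer sample indexes keep their numeric order only under a fixed-width big-endian encoding. The sketch below uses only Python's struct module; encode_key, encode_sample, decode_sample, and the record layout are hypothetical choices for illustration, not a format defined by LMDB or by any framework.

```python
import struct


def encode_key(index):
    """Fixed-width big-endian key: byte order matches numeric order."""
    return struct.pack(">Q", index)


def encode_sample(label, features):
    """Hypothetical layout: uint16 label, uint32 count, then float32 values."""
    header = struct.pack("<HI", label, len(features))
    return header + struct.pack("<%df" % len(features), *features)


def decode_sample(buf):
    """Inverse of encode_sample; works on bytes or a memoryview."""
    label, count = struct.unpack_from("<HI", buf, 0)
    features = struct.unpack_from("<%df" % count, buf, struct.calcsize("<HI"))
    return label, list(features)


indexes = [300, 2, 10]
print(sorted(str(i).encode() for i in indexes))  # [b'10', b'2', b'300']
ordered = sorted(encode_key(i) for i in indexes)
print([struct.unpack(">Q", k)[0] for k in ordered])  # [2, 10, 300]
print(decode_sample(encode_sample(3, [0.5, 1.25, -2.0])))  # (3, [0.5, 1.25, -2.0])
```

With keys encoded this way, a range scan over the store visits samples in index order, and each value can be decoded straight from the buffer the store returns.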