SPECIMEN · 011 · DUCKDB-WASM
PROJECT LAVOS · SPECIMEN SERIES
Numeri, in loco. NUMBERS, IN PLACE · STAGE VII · DATA

the column.

DUCKDB-WASM · OLAP · COLUMNAR · ~6 MB · IN-BROWSER
FIG. 1 · SQL EDITOR · LIVE QUERY
query.sql · read-only schema · events(id, value, level, region, ts) · cmd / ctrl + enter to run
§ I

The whole database, in your tab.

DuckDB is an analytical SQL database in the family of ClickHouse and BigQuery — built around columnar storage and vectorised execution — that speaks a Postgres-flavoured dialect of SQL. It was written in C++ to live inside other programs, the way SQLite does, except it is optimised for aggregations that slice across millions of rows rather than transactional reads of single ones.

In 2021 the DuckDB team began compiling it to WebAssembly. The result is the engine you just loaded: a real OLAP database, six megabytes on the wire, running in a Web Worker on this page. The synthetic events table above was generated by DuckDB itself in the time it took you to scroll past the title — half a million rows, five columns, no network calls. Every query you write hits the engine in your tab.

A real database. In a tab. No server.

§ II

Why the column is the unit.

A traditional SQL database stores rows next to each other on disk. That layout is good for reading one row at a time — the case for transactional applications, where you fetch a customer, an order, an account. It is the wrong layout for analytics, where you almost always want to read one or two columns across every row. Forcing the database to skip past the irrelevant columns of every row is a tax that scales with the table.
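The tax can be felt without any database at all. A toy sketch in plain Python, rows as tuples versus columns as flat arrays: summing one field forces the row layout to touch every record, while the column layout scans a single contiguous array.

```python
from array import array

# Row layout: each record is a tuple; pulling one field still visits every record.
rows = [(i, float(i % 7), "info", "eu") for i in range(100_000)]
row_total = sum(r[1] for r in rows)  # walks past id/level/region in every tuple

# Column layout: each field is its own contiguous array; the others are never touched.
values = array("d", (float(i % 7) for i in range(100_000)))
col_total = sum(values)  # one dense, cache-friendly scan

assert row_total == col_total
```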

DuckDB stores each column as a contiguous array, like Apache Parquet, like Arrow. When you ask SELECT level, COUNT(*) FROM events GROUP BY level, the engine reads only the level column — the other four are never touched. Vectorised execution amortises the per-row cost across thousands of values at a time. This is why the 500,000-row aggregation above completes in single-digit milliseconds on hardware you already own.

§ III

What a server isn't doing.

Read the network panel of your browser's developer tools while you query the table above. You will see the WebAssembly module load once, on first paint, and you will not see another request after that. The query is sent into a Web Worker as a string; the result comes back as an Arrow table. The data set never leaves the page. The query never reaches an analyst's laptop, or a vendor's audit log, or the public internet.

This is not the canonical web architecture. The canonical architecture has data on a server and a client that paints what the server returns. DuckDB-WASM points at a different shape: send the engine to the data, not the other way around. The events table is here because you generated it; tomorrow it will be your CSV, or your Parquet file, or the JSON the API gave you. The pattern is the thing.

Send the engine to the data.

the data didn't leave.
SPECIMEN 011 · MMXXVI
PROJECTLAVOS-BAUHAUS