Databases
pg_mooncake is a Postgres extension that adds columnar storage and vectorized execution (DuckDB) for fast analytics within Postgres. Postgres + pg_mooncake ranks among the top 10 fastest in ClickBench.
Columnstore tables are stored as Iceberg or Delta Lake tables in local file system or cloud storage.
The extension is maintained by Mooncake Labs and is available on Neon Postgres.
Get started quickly with our Docker image:
docker pull mooncakelabs/pg_mooncake
# server
docker run --name mooncake-demo -e POSTGRES_HOST_AUTH_METHOD=trust -d mooncakelabs/pg_mooncake
# client
docker run -it --rm --link mooncake-demo:postgres mooncakelabs/pg_mooncake psql -h postgres -U postgres
Build for Postgres versions 14–17:
make release -j$(nproc)
make install
SET neon.allow_unstable_extensions='true';
CREATE EXTENSION pg_mooncake;
CREATE TABLE user_activity(
user_id BIGINT,
activity_type TEXT,
activity_timestamp TIMESTAMP,
duration INT
) USING columnstore;
INSERT INTO user_activity VALUES
(1, 'login', '2024-01-01 08:00:00', 120),
(2, 'page_view', '2024-01-01 08:05:00', 30),
(3, 'logout', '2024-01-01 08:30:00', 60),
(4, 'error', '2024-01-01 08:13:00', 60);
SELECT * from user_activity;
Columnstore tables behave just like regular Postgres heap tables, supporting transactions, updates, deletes, joins, and more.
Columnstore tables are stored in the local file system by default. You can configure mooncake.default_bucket
to store data in S3 or R2 buckets instead.
Note: On Neon, only cloud storage is supported. Neon users must bring their own S3 or R2 buckets or get a free S3 bucket by signing up at s3.pgmooncake.com. For cloud storage configuration instructions, see Cloud Storage. We are working to improve this experience.
pg_mooncake supports loading data from:
Find your columnstore table location:
SELECT * FROM mooncake.columnstore_tables;
The directory contains a Delta Lake (and soon Iceberg) table that can be queried directly using Pandas, DuckDB, Polars, or Spark.
^ File statistics and skipping should cover most use cases of partitioned tables in Postgres, including time series.