pg_duckdb: Official Postgres extension for DuckDB

pg_duckdb is a Postgres extension that embeds DuckDB's columnar-vectorized analytics engine and features into Postgres. We recommend using pg_duckdb to build high performance analytics and data-intensive applications.

pg_duckdb was developed in collaboration with our partners, Hydra and MotherDuck.

Installation

Pre-built binaries and additional installation options are coming soon.

To build pg_duckdb, you need:

Postgres 16
Ubuntu 22.04 or MacOS
Standard set of build tools for building Postgres extensions
Build tools that are required to build DuckDB

To build and install, run:

make install

Next, load the pg_duckdb extension:

CREATE EXTENSION pg_duckdb;

IMPORTANT: Once loaded you can use DuckDB execution by running SET duckdb.execution TO true. This is opt-in to avoid breaking existing queries. To avoid doing that for every session, you can configure it for a certain user by doing ALTER USER my_analytics_user SET duckdb.execution TO true.

Features

SELECT queries executed by the DuckDB engine can directly read Postgres tables.
- Able to read data types that exist in both Postgres and DuckDB. The following data types are supported: numeric, character, binary, date/time, boolean, uuid, json, and arrays.
- If DuckDB cannot support the query for any reason, execution falls back to Postgres.
Read parquet and CSV files from object storage (AWS S3, Cloudflare R2, or Google GCS).
- SELECT n FROM read_parquet('s3://bucket/file.parquet') AS (n int)
- SELECT n FROM read_csv('s3://bucket/file.csv') AS (n int)
- You can pass globs and arrays to these functions, just like in DuckDB
Enable the DuckDB Iceberg extension using SELECT duckdb.enable_extension('iceberg') and read Iceberg files with iceberg_scan.
Write a query — or an entire table — to parquet in object storage.
- COPY (SELECT foo, bar FROM baz) TO 's3://...'
- COPY table TO 's3://...'

Read and write to Parquet format in a single query

 COPY (
   SELECT count(*), name
   FROM read_parquet('s3://bucket/file.parquet') AS (name text)
   GROUP BY name
   ORDER BY count DESC
 ) TO 's3://bucket/results.parquet';

Query and JOIN data in object storage with Postgres tables, views, and materialized views.
Create indexes on Postgres tables to accelerate your DuckDB queries
Install DuckDB extensions using SELECT duckdb.install_extension('extension_name');
Toggle DuckDB execution on/off with a setting:
- SET duckdb.execution = true|false

Getting Started

The best way to get started is to connect Postgres to a new or existing object storage bucket (AWS S3, Cloudflare R2, or Google GCS) with pg_duckdb. You can query data in Parquet, CSV, and Iceberg format using read_parquet, read_csv, and iceberg_scan respectively.

Add a credential to enable DuckDB's httpfs support.

INSERT INTO duckdb.secrets
(cloud_type, cloud_id, cloud_secret, cloud_region)
VALUES ('S3', 'access_key_id', 'secret_accss_key', 'us-east-1');

Copy data directly to your bucket - no ETL pipeline!

COPY (SELECT user_id, item_id, price, purchased_at FROM purchases)
TO 's3://your-bucket/purchases.parquet;

Perform analytics on your data.

SELECT SUM(price) AS total, item_id
FROM read_parquet('s3://your-bucket/purchases.parquet')
  AS (price float, item_id int)
GROUP BY item_id
ORDER BY total DESC
LIMIT 100;

Roadmap

Please see the project roadmap for upcoming planned tasks and features.

Connect with MotherDuck

pg_duckdb integration with MotherDuck will enable hybrid execution with Differential Storage.

Zero-copy snapshots and forks
Time travel
Data tiering
Improved concurrency and cacheability

Contributing

pg_duckdb was developed in collaboration with our partners, Hydra and MotherDuck. We look forward to their continued contributions and leadership.

Hydra is a Y Combinator-backed database company, focused on DuckDB-Powered Postgres for app developers.

Name		Name	Last commit message	Last commit date
Latest commit History 115 Commits
.github/workflows		.github/workflows
include/pgduckdb		include/pgduckdb
sql		sql
src		src
test/regression		test/regression
third_party		third_party
.clang-format		.clang-format
.clang-tidy		.clang-tidy
.editorconfig		.editorconfig
.gitignore		.gitignore
.gitmodules		.gitmodules
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
Makefile.global		Makefile.global
README.md		README.md
pg_duckdb.control		pg_duckdb.control

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pg_duckdb: Official Postgres extension for DuckDB

Installation

Features

Getting Started

Roadmap

Connect with MotherDuck

Contributing

Resources

About

Releases

Packages

Languages

License

japinli/pg_duckdb

Folders and files

Latest commit

History

Repository files navigation

pg_duckdb: Official Postgres extension for DuckDB

Installation

Features

Getting Started

Roadmap

Connect with MotherDuck

Contributing

Resources

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages