https://github.com/apecloud/myduckserver

Unified MySQL, Postgres & FlightSQL Server, Powered by DuckDB.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.8%) to scientific vocabulary

Keywords

analytics arrow business-analytics business-intelligence columnar-storage data-engineering data-science database duckdb htap mariadb mysql olap pandas parquet polars postgres replication sql zero-etl

Keywords from Contributors

sequences projection interactive serializer measurement cycles packaging charts network-simulation modular
Last synced: 5 months ago

Repository

Unified MySQL, Postgres & FlightSQL Server, Powered by DuckDB.

Basic Info
  • Host: GitHub
  • Owner: apecloud
  • License: apache-2.0
  • Language: Go
  • Default Branch: main
  • Homepage: https://myduck.io/TODO
  • Size: 5.67 MB
Statistics
  • Stars: 438
  • Watchers: 14
  • Forks: 20
  • Open Issues: 37
  • Releases: 0
Topics
analytics arrow business-analytics business-intelligence columnar-storage data-engineering data-science database duckdb htap mariadb mysql olap pandas parquet polars postgres replication sql zero-etl
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License

README.md

MyDuck Server

MyDuck Server unlocks serious power for your MySQL & Postgres analytics. Imagine the simplicity of (MySQL|Postgres)’s familiar interface fused with the raw analytical speed of DuckDB. Now you can supercharge your analytical queries with DuckDB’s lightning-fast OLAP engine, all while using the tools and dialect you know.

❓ Why MyDuck ❓

While MySQL and Postgres are the most popular open-source databases for OLTP, their performance in analytics often falls short. DuckDB, on the other hand, is built for fast, embedded analytical processing. MyDuck Server lets you enjoy DuckDB's high-speed analytics without leaving the (MySQL|Postgres) ecosystem.

With MyDuck Server, you can:

  • Set up an isolated, fast, and real-time replica dedicated to ad-hoc analytics, batch jobs, and LLM-generated queries, without exhausting or corrupting your primary database 🔥
  • Accelerate existing MySQL & Postgres analytics to new heights through DuckDB's high-speed engine with minimal changes 🚀
  • Enable richer & faster connectivity between modern data manipulation & analysis tools and your MySQL & Postgres data 🛠️
  • Go beyond MySQL & Postgres syntax with DuckDB's advanced SQL features to expand your analytics potential 🦆
  • Run DuckDB in server mode to share a DuckDB instance with your team or among your applications 🌩️
  • Build HTAP systems by combining (MySQL|Postgres) for transactions with MyDuck for analytics 🔄
  • and much more! See below for a full list of feature highlights.

MyDuck Server isn't here to replace MySQL & Postgres — it's here to help MySQL & Postgres users do more with their data. This open-source project provides a convenient way to integrate high-speed analytics into your workflow while embracing the flexibility and efficiency of DuckDB.

✨ Key Features

  • Blazing Fast OLAP with DuckDB: MyDuck stores data in DuckDB, an OLAP-optimized database known for lightning-fast analytical queries. DuckDB enables MyDuck to execute queries up to 1000x faster than traditional MySQL & Postgres setups, making previously infeasible complex analytics practical.

  • MySQL-Compatible Interface: MyDuck implements the MySQL wire protocol and understands MySQL syntax, allowing you to connect with any MySQL client and run MySQL-style SQL. MyDuck automatically translates your queries and executes them in DuckDB.

  • Postgres-Compatible Interface: MyDuck implements the Postgres wire protocol, enabling you to send DuckDB SQL directly using any Postgres client. Since DuckDB's SQL dialect closely resembles PostgreSQL, you can speed up existing Postgres queries with minimal changes.

  • Raw DuckDB Power: MyDuck provides full access to DuckDB's analytical capabilities through raw DuckDB SQL, including friendly SQL syntax, advanced aggregates, remote data source access, nested data types, and more.

  • Zero-ETL: Simply start replication and begin querying! MyDuck can function as a MySQL replica or Postgres standby, replicating data from your primary server in real-time. It works like standard MySQL & Postgres replication - using MySQL's START REPLICA or Postgres' CREATE SUBSCRIPTION commands, eliminating the need for complex ETL pipelines.

  • Consistent and Efficient Replication: Thanks to DuckDB's solid ACID support, we've carefully managed transaction boundaries in the replication stream to ensure a consistent data view — you'll never see dirty data mid-transaction. Plus, MyDuck's transaction batching collects updates from multiple transactions and applies them to DuckDB in batches, significantly reducing write overhead (since DuckDB isn’t designed for high-frequency OLTP writes).

  • HTAP Architecture Support: MyDuck works well with database proxy tools to enable hybrid transactional/analytical processing setups. You can route DML operations to (MySQL|Postgres) and analytical queries to MyDuck, creating a powerful HTAP architecture that combines the best of both worlds.

  • Bulk Upload & Download: MyDuck supports fast bulk data loading from the client side with the standard MySQL LOAD DATA LOCAL INFILE command or the PostgreSQL COPY FROM STDIN command. You can also extract data from MyDuck using the PostgreSQL COPY TO STDOUT command.

  • End-to-End Columnar IO: In addition to the traditional row-oriented data transfer of the MySQL & Postgres protocols, MyDuck can also send query results and receive data uploads in columnar format, which can be significantly faster for high-volume data. This is implemented on top of the standard Postgres COPY protocol with extended columnar format support, e.g., COPY ... TO STDOUT (FORMAT parquet | arrow), so you can use a standard Postgres client library to interact with MyDuck in an optimized way.

  • Standalone Mode: MyDuck can run in standalone mode without replication. In this mode, it is a drop-in replacement for (MySQL|Postgres), but with a DuckDB heart. You can CREATE TABLE, transactionally INSERT, UPDATE, and DELETE data, and run blazingly fast SELECT queries.

  • DuckDB in Server Mode: If you aren't interested in MySQL & Postgres but just want to share a DuckDB instance with your team or among your applications, MyDuck is also a great solution. You can deploy MyDuck to a server, connect to it with the Postgres client library in your favorite programming language, and start running DuckDB SQL queries directly.

  • Seamless Integration with Dump & Copy Utilities: MyDuck plays well with modern MySQL & Postgres data migration tools, especially the MySQL Shell and pg_dump. For MySQL, you can load data into MyDuck in parallel from a MySQL Shell dump, or leverage the Shell’s copy-instance utility to copy a consistent snapshot of your running MySQL server to MyDuck. For Postgres, MyDuck can load data from a pg_dump archive.
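The columnar export mentioned under End-to-End Columnar IO can be sketched from the command line. This is a minimal sketch, not the project's documented workflow: it assumes psql passes the extended COPY format through unchanged, and the names export_parquet, MYDUCK_HOST, and MYDUCK_PG_PORT are hypothetical helpers introduced here. The defaults match the ports used in the Getting Started section.

```bash
# Hypothetical settings; defaults follow the Docker port mapping in Getting Started.
MYDUCK_HOST="${MYDUCK_HOST:-127.0.0.1}"
MYDUCK_PG_PORT="${MYDUCK_PG_PORT:-15432}"

# export_parquet QUERY OUTFILE
# Wraps QUERY in COPY ... TO STDOUT (FORMAT parquet) and redirects the
# stream psql writes to stdout into OUTFILE.
export_parquet() {
  query=$1
  outfile=$2
  psql -h "$MYDUCK_HOST" -p "$MYDUCK_PG_PORT" -U postgres \
    -c "COPY ($query) TO STDOUT (FORMAT parquet)" > "$outfile"
}
```

For example, export_parquet "SELECT * FROM orders" orders.parquet would write the result to orders.parquet, where orders is a placeholder table name.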

📊 Performance

Typical OLAP queries can run up to 1000x faster with MyDuck Server compared to MySQL & Postgres alone, especially on large datasets. Under the hood, it's just DuckDB doing what it does best: processing analytical queries at lightning speed. You are welcome to run your own benchmarks and prepare to be amazed! Alternatively, you can refer to well-known benchmarks such as ClickBench and the H2O.ai db-benchmark to see how DuckDB performs against other databases and data science tools. Also remember that DuckDB has robust support for transactions, JOINs, and larger-than-memory query processing, which are unavailable in many competing systems and tools.

🏃‍♂️ Getting Started

Prerequisites

  • Docker (recommended) for setting up MyDuck Server quickly.
  • MySQL or PostgreSQL CLI clients for connecting and testing your setup.

Installation

Get a standalone MyDuck Server up and running in minutes using Docker:

    docker run -p 13306:3306 -p 15432:5432 apecloud/myduckserver:latest

This setup exposes:

  • Port 13306 for MySQL wire protocol connections.
  • Port 15432 for PostgreSQL wire protocol connections, allowing direct DuckDB SQL.

Usage

Connecting via MySQL client

Connect using any MySQL client to run MySQL-style SQL queries:

    mysql -h127.0.0.1 -P13306 -uroot

Note: MySQL CLI clients version 9.0 and above are not yet supported on macOS. Consider installing an 8.4 client with brew install mysql-client@8.4.

Connecting via PostgreSQL client

For full analytical power, connect to the Postgres port and run DuckDB SQL queries directly:

    psql -h 127.0.0.1 -p 15432 -U postgres
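To confirm that both endpoints respond after the container starts, a small smoke test can send a trivial query through each protocol. This is only a sketch: the function name smoke_test is hypothetical, and it assumes the default port mapping (13306 and 15432) and users shown above.

```bash
# smoke_test: send SELECT 1 through the MySQL port, then the Postgres port.
# Returns non-zero as soon as either client fails to connect or run the query.
smoke_test() {
  mysql -h127.0.0.1 -P13306 -uroot -e "SELECT 1" || return 1
  psql -h 127.0.0.1 -p 15432 -U postgres -c "SELECT 1" || return 1
  echo "both endpoints answered"
}
```

Calling smoke_test prints each client's result followed by the final message, or stops at the first failing client.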

Replicating Data

We have integrated a setup tool in the Docker image that helps replicate data from your primary (MySQL|Postgres) server to MyDuck Server. The tool is available via the SETUP_MODE environment variable. In REPLICA mode, the container will start MyDuck Server, dump a snapshot of your primary (MySQL|Postgres) server, and start replicating data in real-time.

Note: Supported primary database versions: MySQL>=8.0 and PostgreSQL>=13. In addition to the default settings, logical replication must be enabled for PostgreSQL by setting wal_level=logical. For MySQL, GTID-based replication (gtid_mode=ON and enforce_gtid_consistency=ON) is recommended but not required.
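Before starting the replica, it can help to check these settings on the primary. The sketch below is one possible way to do so with the standard clients. The function names are hypothetical, and credentials are assumed to come from the usual client mechanisms (for example PGPASSWORD or ~/.my.cnf) rather than from the script.

```bash
# check_pg_wal_level HOST PORT USER DB
# Succeeds only if the PostgreSQL primary reports wal_level=logical.
check_pg_wal_level() {
  level=$(psql -h "$1" -p "$2" -U "$3" -d "$4" -Atc "SHOW wal_level") || return 2
  [ "$level" = "logical" ]
}

# check_mysql_gtid HOST PORT USER
# GTID mode is only recommended, so this warns on stderr but still succeeds
# when it is off; it fails only if the client cannot run the query.
check_mysql_gtid() {
  mode=$(mysql -h "$1" -P "$2" -u "$3" -N -B -e "SELECT @@global.gtid_mode") || return 2
  if [ "$mode" != "ON" ]; then
    echo "warning: gtid_mode=$mode (GTID-based replication is recommended)" >&2
  fi
  return 0
}
```

For instance, check_pg_wal_level example.com 5432 postgres db01 returns a non-zero status when wal_level is anything other than logical.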

    docker run -d --name myduck \
      -p 13306:3306 \
      -p 15432:5432 \
      --env=SETUP_MODE=REPLICA \
      --env=SOURCE_DSN="<postgres|mysql>://<user>:<password>@<host>:<port>/<dbname>" \
      apecloud/myduckserver:latest

SOURCE_DSN specifies the connection string to the primary database server, which can be either MySQL or PostgreSQL.

  • MySQL Primary: Use the mysql URI scheme, e.g.,
    --env=SOURCE_DSN=mysql://root:password@example.com:3306

  • PostgreSQL Primary: Use the postgres URI scheme, e.g.,
    --env=SOURCE_DSN=postgres://postgres:password@example.com:5432/db01

Note: To replicate from a server running on the host machine, use host.docker.internal as the hostname instead of localhost or 127.0.0.1. On Linux, you must also add --add-host=host.docker.internal:host-gateway to the docker run command.
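The DSN rules above can be captured in a small helper. This is a sketch under stated assumptions: source_dsn is a hypothetical function, not part of MyDuck Server, and it does not percent-encode special characters in the user name or password.

```bash
# source_dsn SCHEME USER PASSWORD HOST PORT [DBNAME]
# Prints a SOURCE_DSN value. Loopback hosts are rewritten to
# host.docker.internal so the container can reach a primary on the host.
source_dsn() {
  scheme=$1 user=$2 password=$3 host=$4 port=$5 dbname=${6:-}
  case "$scheme" in
    mysql|postgres) ;;
    *) echo "unsupported scheme: $scheme" >&2; return 1 ;;
  esac
  case "$host" in
    localhost|127.0.0.1) host=host.docker.internal ;;
  esac
  dsn="$scheme://$user:$password@$host:$port"
  if [ -n "$dbname" ]; then
    dsn="$dsn/$dbname"
  fi
  printf '%s\n' "$dsn"
}
```

The result can then be passed to the container, for example with --env=SOURCE_DSN="$(source_dsn postgres postgres password localhost 5432 db01)".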

Connecting to Cloud MySQL & Postgres

MyDuck Server supports setting up replicas from common cloud-based MySQL & Postgres offerings. For more information, please refer to the replica setup guide.

HTAP Setup

With MyDuck's powerful analytics capabilities, you can create a hybrid transactional/analytical processing system where high-frequency data writes are directed to a standard MySQL or Postgres instance, while analytical queries are handled by a MyDuck Server instance. Follow our HTAP setup instructions to easily set up an HTAP demonstration:

  • Provisioning a MySQL HTAP cluster based on ProxySQL or MariaDB MaxScale.
  • Provisioning a PostgreSQL HTAP cluster based on PGPool-II.

Customizing the Docker Container

To rename the default database, pass the DEFAULT_DB environment variable to the Docker container:

    docker run -d -p 13306:3306 -p 15432:5432 \
      --env=DEFAULT_DB=mydbname \
      apecloud/myduckserver:latest

To set the superuser password, pass the SUPERUSER_PASSWORD environment variable to the Docker container:

    docker run -d -p 13306:3306 -p 15432:5432 \
      --env=SUPERUSER_PASSWORD=mysecretpassword \
      apecloud/myduckserver:latest

To initialize MyDuck Server with custom SQL statements, mount your .sql file to either /docker-entrypoint-initdb.d/mysql/ or /docker-entrypoint-initdb.d/postgres/ inside the Docker container, depending on the SQL dialect you're using.

For example:

```bash
# Execute init.sql via MySQL protocol
docker run -d -p 13306:3306 --name=myduck \
  -v ./init.sql:/docker-entrypoint-initdb.d/mysql/init.sql \
  apecloud/myduckserver:latest

# Execute init.sql via PostgreSQL protocol
docker run -d -p 15432:5432 --name=myduck \
  -v ./init.sql:/docker-entrypoint-initdb.d/postgres/init.sql \
  apecloud/myduckserver:latest
```

Query Parquet Files

Looking to load Parquet files into MyDuck Server and start querying? Follow our Parquet file loading guide for easy setup.

Already Using DuckDB?

Already have a DuckDB file? You can seamlessly bootstrap MyDuck Server with it. See our DuckDB file bootstrapping guide for more details.

Managing Multiple Databases

MyDuck Server lets you manage multiple databases just as you would in Postgres. For step-by-step instructions and detailed guidance, check out our Database Management Guide.

Backup and Restore with Object Storage

To back up and restore your databases inside MyDuck Server using object storage, refer to our backup and restore guide for detailed instructions.

LLM Integration

MyDuck Server can be integrated with LLM applications via the Model Context Protocol (MCP). Follow the MCP integration guide to set up MyDuck Server as an external data source for LLMs.

Access from Python

MyDuck Server can be seamlessly accessed from the Python data science ecosystem. Follow the Python integration guide to connect to MyDuck Server from Python and export data to PyArrow, pandas, and Polars. Additionally, check out the Ibis integration guide for using the Ibis dataframe API to query MyDuck Server directly.

🎯 Roadmap

We have big plans for MyDuck Server! Here are some of the features we’re working on:

  • [x] Arrow Flight SQL.
  • [x] Multiple DB.
  • [ ] Authentication.
  • [ ] ...and more! We’re always looking for ways to make MyDuck Server better. If you have a feature request, please let us know by opening an issue.

🏡 Join the Community

Let's connect on Discord to discuss requirements, address issues, and share user experiences.

💡 Contributing

MyDuck Server is open-source, and we’d love your help to keep it growing! Check out our CONTRIBUTING.md for ways to get involved. From bug reports to feature requests, all contributions are welcome!

💗 Acknowledgements

MyDuck Server is built on top of a collection of amazing open-source projects, notably:

  • DuckDB - The fast in-process analytical database that powers MyDuck Server.
  • go-mysql-server - The outstanding MySQL server implementation in Go maintained by DoltHub that MyDuck Server is built on. We also draw significant inspiration from Dolt and Doltgres.
  • Vitess - Provides the MySQL replication stream used in MyDuck Server.
  • go-duckdb - An excellent Go driver for DuckDB that works seamlessly.
  • SQLGlot - The ultimate SQL transpiler.

We are grateful to the developers and contributors of these projects for their hard work and dedication to open-source software.

📝 License

MyDuck Server is released under the Apache License 2.0.

Owner

  • Name: ApeCloud
  • Login: apecloud
  • Kind: organization

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 259
  • Total Committers: 10
  • Avg Commits per committer: 25.9
  • Development Distribution Score (DDS): 0.432
Past Year
  • Commits: 259
  • Committers: 10
  • Avg Commits per committer: 25.9
  • Development Distribution Score (DDS): 0.432
Top Committers
Name | Email | Commits
Fan Yang | f****g@a****m | 147
Yusong Gao | y****o@g****m | 33
TianyuZhang1214 | n****o@a****m | 27
Noy | 1****n | 18
Sean Wu | 1****9 | 14
贺达 | 1****0 | 9
Wei Cao | c****o@g****m | 7
huangzhangshu | 1****k | 2
dependabot[bot] | 4****] | 1
Sergei Glushchenko | g****i@g****m | 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 131
  • Total pull requests: 215
  • Average time to close issues: 14 days
  • Average time to close pull requests: about 20 hours
  • Total issue authors: 20
  • Total pull request authors: 10
  • Average comments per issue: 0.95
  • Average comments per pull request: 0.36
  • Merged pull requests: 207
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 127
  • Pull requests: 202
  • Average time to close issues: 13 days
  • Average time to close pull requests: about 20 hours
  • Issue authors: 20
  • Pull request authors: 10
  • Average comments per issue: 0.91
  • Average comments per pull request: 0.36
  • Merged pull requests: 194
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • fanyang01 (33)
  • TianyuZhang1214 (26)
  • GaoYusong (23)
  • VWagen1989 (22)
  • NoyException (12)
  • aszenz (3)
  • earayu (2)
  • benpoulson (2)
  • thucnc (2)
  • snamper (2)
  • huynguyn2000 (1)
  • gregorysimon1 (1)
  • anentropic (1)
  • gl-sergei (1)
  • kuatroka (1)
Pull Request Authors
  • fanyang01 (185)
  • TianyuZhang1214 (38)
  • GaoYusong (30)
  • NoyException (22)
  • VWagen1989 (22)
  • ddh-5230 (19)
  • weicao (8)
  • JashBook (2)
  • dependabot[bot] (2)
  • gl-sergei (1)
Top Labels
Issue Labels
bug (16) compatibility (12) enhancement (4) Discord (1) test (1) performance (1) help wanted (1)
Pull Request Labels
dependencies (2)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 0
proxy.golang.org: github.com/apecloud/myduckserver

Copyright 2024-2025 ApeCloud, Ltd. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.0%
Average: 6.2%
Dependent repos count: 6.4%
Last synced: 7 months ago

Dependencies

.github/workflows/mysql-replication.yml actions
  • actions/checkout v4 composite
  • actions/setup-go v5 composite
  • actions/setup-python v5 composite
.github/workflows/postgres-replication.yml actions
  • actions/checkout v4 composite
  • actions/setup-go v5 composite
  • actions/setup-python v5 composite
.github/workflows/psql.yml actions
  • actions/checkout v4 composite
  • actions/setup-go v5 composite
  • actions/setup-python v5 composite
.github/workflows/release-image.yml actions
devtools/htap-setup/maxscale/docker-compose.yml docker
  • apecloud/myduckserver latest
  • mariadb/maxscale 24.02
  • mysql 8.0
devtools/htap-setup/proxysql/docker-compose.yml docker
  • apecloud/myduckserver latest
  • mysql 8
  • proxysql/proxysql latest
.github/workflows/go.yml actions
  • actions/checkout v4 composite
  • actions/setup-go v5 composite
  • actions/setup-python v5 composite
docker/Dockerfile docker
  • debian bookworm-slim build
  • golang 1.22.7 build
go.mod go
  • filippo.io/edwards25519 v1.1.0
  • github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6
  • github.com/DATA-DOG/go-sqlmock v1.5.2
  • github.com/Shopify/toxiproxy/v2 v2.9.0
  • github.com/apache/arrow/go/v17 v17.0.0
  • github.com/apecloud/dolt-vitess v0.0.0-20241028060845-4a2a0444a0ac
  • github.com/beorn7/perks v1.0.1
  • github.com/cespare/xxhash/v2 v2.3.0
  • github.com/cockroachdb/apd/v3 v3.2.1
  • github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc
  • github.com/dolthub/doltgresql v0.13.0
  • github.com/dolthub/flatbuffers/v23 v23.3.3-dh.2
  • github.com/dolthub/go-icu-regex v0.0.0-20240916130659-0118adc6b662
  • github.com/dolthub/jsonpath v0.0.2-0.20240227200619-19675ab05c71
  • github.com/fanyang01/go-mysql-server v0.0.0-20241021025444-83e2e88c99aa
  • github.com/go-kit/kit v0.10.0
  • github.com/go-sql-driver/mysql v1.8.1
  • github.com/goccy/go-json v0.10.3
  • github.com/gocraft/dbr/v2 v2.7.2
  • github.com/golang/glog v1.2.2
  • github.com/google/flatbuffers v24.3.25+incompatible
  • github.com/google/uuid v1.6.0
  • github.com/gorilla/mux v1.8.1
  • github.com/hashicorp/golang-lru v1.0.2
  • github.com/jackc/pgx/v5 v5.7.1
  • github.com/jmoiron/sqlx v1.4.0
  • github.com/klauspost/compress v1.17.9
  • github.com/klauspost/cpuid/v2 v2.2.8
  • github.com/lestrrat-go/strftime v1.0.4
  • github.com/lib/pq v1.10.9
  • github.com/marcboeker/go-duckdb v1.8.2-0.20241002112231-62d5fa8c0697
  • github.com/mattn/go-colorable v0.1.13
  • github.com/mattn/go-isatty v0.0.20
  • github.com/mitchellh/mapstructure v1.5.0
  • github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822
  • github.com/pierrec/lz4/v4 v4.1.21
  • github.com/pires/go-proxyproto v0.7.0
  • github.com/pkg/errors v0.9.1
  • github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10
  • github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2
  • github.com/prometheus/client_golang v1.20.3
  • github.com/prometheus/client_model v0.6.1
  • github.com/prometheus/common v0.59.1
  • github.com/prometheus/procfs v0.15.1
  • github.com/rs/xid v1.5.0
  • github.com/rs/zerolog v1.33.0
  • github.com/shopspring/decimal v1.3.1
  • github.com/sirupsen/logrus v1.8.1
  • github.com/spf13/pflag v1.0.5
  • github.com/stretchr/testify v1.9.0
  • github.com/tetratelabs/wazero v1.1.0
  • github.com/xdg-go/stringprep v1.0.4
  • github.com/zeebo/xxh3 v1.0.2
  • go.opentelemetry.io/otel v1.30.0
  • go.opentelemetry.io/otel/trace v1.30.0
  • golang.org/x/crypto v0.27.0
  • golang.org/x/exp v0.0.0-20240909161429-701f63a606c0
  • golang.org/x/mod v0.21.0
  • golang.org/x/sync v0.8.0
  • golang.org/x/sys v0.25.0
  • golang.org/x/text v0.18.0
  • golang.org/x/tools v0.25.0
  • golang.org/x/xerrors v0.0.0-20240903120638-7835f813f4da
  • google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1
  • google.golang.org/grpc v1.66.2
  • google.golang.org/protobuf v1.34.2
  • gopkg.in/src-d/go-errors.v1 v1.0.0
  • gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7
  • gopkg.in/yaml.v3 v3.0.1
  • vitess.io/vitess v0.21.0
go.sum go
  • 474 dependencies