DSQL is my favorite Re:Invent announcement in a while, and it feels like AWS returning to what it does best: foundational services that change the categories they exist in. Still, I don’t think AWS is selling it to developers in the right way.
I’m old enough (ow, my back!) to remember deploying before S3’s ubiquitous HTTP-connected object storage or dynamically provisioned EC2 instances. I was deeply excited (on this very blog) a decade ago for Lambda as a replacement for cron on a server.
DSQL should (one day) join the category of the core services that power AWS customers and AWS’ own systems: S3, DynamoDB, Kinesis, EBS, EC2, and Lambda. The fundamentals behind DSQL’s dynamic multi-tenancy and scale-to-zero look just like those other foundational services. The Firecracker-driven compute, the horizontally scalable journaling, and the storage layer from the rest of the Aurora family combine into a strikingly simple architecture on top of AWS’ global infrastructure.
This is where it feels like AWS has missed the mark explaining DSQL to the world. Based on conversations with other engineers, DSQL is mostly being received as a PostgreSQL-wire compatible alternative to Spanner. This is consistent across all but the most chronically AWS-online.
The reason is understandable. The DSQL team built a multi-region active-active datastore using new primitives but with an API that nearly every developer already knows how to use. If I’d built it I’d be shouting from the rooftops too, but developers hear it as exactly that: a multi-region Spanner alternative rather than a database for their everyday applications.
Unfortunately this misses what I think will be the vast majority of DSQL usage: eating single-region serverless applications at all scales. Even compared to Aurora Serverless v2, DSQL is “more serverless” and is likely to have much friendlier price-performance at the low end. DSQL seems better at scale-to-zero, equally usable from existing ORMs, and if things go well for your nascent application, there’s a clear path to multi-region.
Further, databases are one of the pieces of infrastructure that benefit most from being serverless. Think about the last `16xlarge` RDS instance you saw at 10% utilization. Was it sized up during an outage, never to be right-sized?
Using a serverless datastore like DynamoDB, S3, Momento, or (now) DSQL massively improves your reliability because it removes individual machines and disks from your critical path.
Consider how much work some developers have been willing to put into transliterating their data models to DynamoDB, and then consider my pricing speculation below. Much ink has been spilled on single-table DynamoDB schemas – there’s a whole dang book – to gain the operations and reliability advantages of a fully serverless database.
Because of the SQL support, though, DSQL won’t ever be able to provide the kind of performance invariants that DynamoDB does. It’s possible to make a small tweak to a query and blow up its complexity, or to suddenly miss an index and require a partial scan. Still, having spent years teaching DynamoDB and helping teams make tradeoffs while building applications, I’m happy to have a serverless SQL option available.
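A contrived sketch of what I mean (hypothetical `orders` table and placeholder connection URL; this is stock PostgreSQL planner behavior, not anything DSQL-specific): one small tweak turns an index lookup into a scan, and `EXPLAIN` is the only guardrail.

```python
from sqlalchemy import create_engine, text

# Placeholder URL: any PostgreSQL-compatible endpoint works for this demo.
engine = create_engine("postgresql+psycopg2://user:pass@host:5432/app")

with engine.connect() as conn:
    # Index-friendly: equality against an indexed column -> index scan.
    indexed = text("EXPLAIN SELECT * FROM orders WHERE customer_id = :cid")
    for row in conn.execute(indexed, {"cid": 42}):
        print(row[0])

    # The "small tweak": wrap the column in a function and the planner can
    # no longer use a plain index on email, so it falls back to scanning rows.
    tweaked = text("EXPLAIN SELECT * FROM orders WHERE lower(email) = :email")
    for row in conn.execute(tweaked, {"email": "a@example.com"}):
        print(row[0])
```

In DynamoDB that class of regression is impossible to express; in any SQL engine, it’s one code review away.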
As part of understanding DSQL, I looked back at prior Re:Invent talks and blog posts about other parts of the Aurora family and adjacent AWS databases like MemoryDB. DynamoDB’s new multi-region strong consistency model (preview) and MemoryDB’s multi-region active-active support seem to share a lineage with DSQL.
The most interesting comparison to DSQL for me is the newly-GA’d Aurora PostgreSQL Limitless. They both use the PostgreSQL wire protocol and (for different reasons) use query push-down with a router. In PostgreSQL Limitless, the router-with-shards model gets beyond the 128TB limit on a single database. DSQL has a 100GB (configurable) limit per database; its router and push-down aren’t for additional storage but to allow query processors (QPs) to be ephemeral.
| Service | Scaling | Multi-Region | Wire Protocol |
|---|---|---|---|
| DSQL | Horizontal (QP) | Active-Active | PSQL |
| Postgres Limitless | Horizontal (ACU) | No | PSQL |
| Serverless v2 | Horizontal-ish (ACU) | Active-Passive | PSQL/MySQL |
| Aurora PSQL/MySQL | Vertical (Instance) | Active-Passive | PSQL/MySQL |
| MemoryDB | Vertical (Instance) | Active-Active | Redis/Valkey |
Comparing multi-region DSQL (active-active) and Aurora Serverless v2 (active-passive) is also instructive. DSQL’s segregation of conflict detection (Arbiters) and write application (Journal) makes crossing regions easier. Both DynamoDB’s multi-region strong consistency and DSQL require exactly 3 regions (MRSC has 3 replicas, DSQL has 2 active regions and a witness). Based on Marc’s post about replication, though, as long as there are at least 3 regions involved, DSQL would be able to use a quorum to stay available on the happy side of a partition.
Looking at the various architectural components in DSQL, each has a pricing precedent, and AWS has plenty of experience operating them (and knows exactly what they cost). We know they’ve reused an existing internal journaling service, the Aurora storage engine, and the Firecracker VMs that underlie AWS Lambda. Here are the existing prices for everything we know they’re using:
| Component | Price | Unit |
|---|---|---|
| Lambda Runtime** | $0.133 | 10 million ms @ 1GB |
| Lambda Invoke | $0.20 | 1 million invokes |
| Aurora Data API | $0.35 | 1 million requests |
| Aurora Read IOPs | $0.20 | 1 million 8KB PSQL operations |
| DynamoDB Reads | $0.031 | 1 million KB on-demand reads |
| Aurora Write IOPs | $0.20 | 1 million 4KB PSQL operations |
| DynamoDB Writes | $0.625 | 1 million KB on-demand writes |
| Aurora Global DB | $0.20 | 1 million 4KB replicated writes |
| Aurora Storage | $0.10 | 1 GB-month |
| DynamoDB Storage | $0.25 | 1 GB-month |
I’d presume storage and IOPS costs similar to the rest of Aurora, and Firecracker QPs priced roughly like Lambda. Also, let’s say write transactions will cost 5x-20x more than simple reads. In Aurora PostgreSQL, writes are 2x the cost of reads, but DSQL’s journaling sounds closer to DynamoDB’s replicated-write strategy than to standard Aurora’s. DynamoDB is 20x more expensive to write than to read, and DSQL writes have to transit from the writing QP to the responsible Arbiters, then through the journal, and finally to storage.
Another point in favor of my guesswork: the Data API pricing ($0.35/million) roughly correlates to 10ms of Lambda runtime plus the invocation base.
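The quick arithmetic, using the component prices above (my mapping, not anything AWS has published):

```python
# Per-million-request component prices (USD) from the table above.
lambda_invoke = 0.20  # $0.20 per 1M invokes
runtime_10ms = 0.133  # $0.133 per 10M ms @ 1GB == 10ms per request over 1M requests

print(f"${lambda_invoke + runtime_10ms:.3f} per million")  # ~$0.333 vs the Data API's $0.35
```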
The DSQL limitations document lists `Error 53200: non-configurable 128MB memory limit per transaction`, so maybe the GB-second is the wrong unit. It could be that the Firecracker VMs are set lower than the 1GB I used for the math here.
Given all that, if DSQL’s pricing matches the architectural details shared so far:
Read costs:
- $0.20 (Aurora reads)
- $0.20 (Lambda Invoke)
- $0.133 (10ms Lambda Runtime at 1GB)
Write costs:
- $0.40 (Aurora writes)
- $0.20 (Lambda Invoke)
- $0.133 (10ms Lambda Runtime at 1GB)
- $0.20 (X-Region Write Replication)
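Summing those components (all speculation on my part, using the prices above):

```python
# Speculated per-million-request costs (USD) from the read/write lists above.
reads = 0.20 + 0.20 + 0.133          # Aurora reads + invoke + 10ms runtime
writes = 0.40 + 0.20 + 0.133 + 0.20  # Aurora writes + invoke + runtime + x-region replication

print(f"reads:  ${reads:.3f}/million")   # ~$0.533
print(f"writes: ${writes:.3f}/million")  # ~$0.933
```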
Let’s call it $0.55 per million reads and $0.95 per million replicated writes. That’s significantly more expensive at first glance than DynamoDB on the read path, but keep in mind Aurora IOPs only count if the data isn’t served from memory. If you have a power-law access distribution and a data set small enough to live in the as-yet-undefined cache available to the QPs, maybe that read cost can come down.
At that price, small apps will have IOPs costs overshadowed by Route53 zones (an annoying-but-not-problematic $0.50 each) or the criminally priced $9 hot dogs at the Bills’ stadium.
And now… EVEN MORE speculation about when we’ll see DSQL join the ranks of AWS’ one-database-one-job lineup officially.
Aurora Limitless was announced at Re:Invent 2023, and went GA in time for Re:Invent 2024.
If DSQL follows the same path, we’ll see GA before or during Re:Invent 2025.
DSQL’s docs feature a long list of limitations, some of which seem like a GA wishlist.
I’d be shocked if the service went GA with an `admin` role that could be stymied by other users creating objects:
Aurora DSQL creates the admin role with all new Aurora DSQL clusters. Currently, this role lacks permissions on objects that other users create. This limitation prevents the admin role from granting or revoking permissions on objects that the admin role didn’t create.
Some limits look intrinsic to how DSQL is designed and can’t/won’t be something the team changes. For example, I doubt they’ll hook PostgreSQL’s connection-limit updates to the infra-layer concurrency controls (DSQL’s version of Lambda’s Reserved Concurrency):
Aurora DSQL doesn’t support the command `ALTER ROLE [] CONNECTION LIMIT`. Contact AWS support if you need a connection limit increase.
It’s easy to look at the list of limitations and little annoyances (`psycopg3` doesn’t work out of the box) and get caught up, but DSQL getting to preview is a massive accomplishment.
Congratulations to the Aurora team and everyone who worked on the subsystems that made it possible.
DSQL delivers what look, at first glance, like impossible consistency guarantees behind a normal SQL interface and dead-simple provisioning.
All this to say: go build! While it’s in preview, DSQL is free and it’s given me a reason to dust off Ol’ Reliable: SQLAlchemy.
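For anyone else following along, here’s a minimal connection sketch. The endpoint is a placeholder, and I’m assuming boto3’s `dsql` token helper and the default `admin`/`postgres` setup from the preview docs; double-check the current docs before copying:

```python
from urllib.parse import quote_plus

import boto3
from sqlalchemy import create_engine, text

# Placeholders: your cluster endpoint and its region.
endpoint = "abc123example.dsql.us-east-1.on.aws"
region = "us-east-1"

# DSQL authenticates with short-lived IAM tokens instead of passwords.
token = boto3.client("dsql", region_name=region).generate_db_connect_admin_auth_token(
    endpoint, region
)

# Plain PostgreSQL wire protocol from here on; the token is the password.
engine = create_engine(
    f"postgresql+psycopg2://admin:{quote_plus(token)}@{endpoint}:5432/postgres",
    connect_args={"sslmode": "require"},
)

with engine.connect() as conn:
    print(conn.execute(text("SELECT 1")).scalar())
```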
This post is my synthesis of various AWS docs, announcements, and Re:Invent talks as well as Marc Brooker’s lovely DSQL Vignettes.