DuckDB now has a vector similarity search extension, vss, with native HNSW index support. If you are building retrieval-augmented generation (RAG) pipelines and already use DuckDB for analytics, you may no longer need a separate vector store. One embedded database can handle both your analytical queries and your similarity search.
The performance story is credible at moderate scale. For corpora up to a few million vectors, DuckDB’s VSS can be competitive with purpose-built solutions like Pinecone or Weaviate, though results depend on vector dimensionality, hardware, and query mix. It will not match a dedicated vector database at hundreds of millions of vectors, but most production RAG applications do not operate at that scale. The majority sit comfortably in the range where DuckDB performs well.
This matters because the default RAG architecture has become unnecessarily complex. The standard stack involves an embedding model, a vector database, an analytical database for metadata filtering, and glue code to coordinate queries across both. DuckDB with VSS collapses the vector database and the analytical database into a single process with a single query language. That is a real reduction in operational complexity.
The implications are strongest for small-to-medium RAG deployments, internal knowledge bases, and prototypes that need to move to production without re-architecting. If you are currently running Postgres with pgvector alongside a separate analytics database, DuckDB might replace both.
The honest trade-off: you lose the managed infrastructure and horizontal scaling that dedicated vector databases provide. If your use case requires multi-region replication or vector search across billions of embeddings, DuckDB is not the answer. But for the vast majority of RAG applications, the simpler architecture is the better architecture.