perf: optimize vector search SQL to leverage HNSW/GIN index for 40-500x speedup by zhouliang5266 · Pull Request #5285 · 1Panel-dev/MaxKB

zhouliang5266 · 2026-05-24T09:59:38Z

Summary

The current vector search SQL queries perform full table scans even when HNSW/GIN indexes exist, causing severe performance degradation on large datasets.

This PR rewrites the 3 search SQL files to use index-friendly query patterns and adds per-knowledge-base query routing to leverage partial HNSW indexes, achieving 40-500x speedup on large-scale deployments.

Problem

Two issues prevent HNSW/GIN indexes from being used:

1. SQL queries don't utilize indexes

Although common.py already creates per-knowledge-base HNSW indexes (embedding_hnsw_idx_{k_id}), the SQL queries fail to utilize these indexes:

embedding_search.sql: the inner query's ORDER BY distance has no LIMIT, so it scans the entire embedding table and pgvector falls back to exact search
blend_search.sql: computes both the vector distance and ts_rank_cd for every row in a single pass, which forces a full table scan with no early termination
keywords_search.sql: computes ts_rank_cd() on every row without an @@ pre-filter, so the GIN index is never used

2. Per-KB partial indexes bypassed by knowledge_id__in

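hit_test() and query() filter with knowledge_id__in=knowledge_id_list, which produces WHERE knowledge_id IN (...). Each partial index is defined with the predicate WHERE knowledge_id = '{k_id}', and the planner cannot prove that a multi-value IN list implies that predicate. It therefore cannot use any of the partial indexes and falls back to a full table scan across all knowledge bases.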

Solution

SQL Optimization (3 files)

embedding_search.sql: Use a CTE (WITH vector_top AS) to first fetch the top-K candidates via the HNSW index with LIMIT LEAST(top_number * 10, 500), then apply DISTINCT ON and threshold filtering to that small candidate set.
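The shape of that query can be sketched as follows. This is an illustration under stated assumptions, not the SQL shipped in the PR: the column names paragraph_id and embedding, the cosine-distance operator, and the helper embedding_search_params are all hypothetical. Only the embedding table, knowledge_id, the vector_top CTE name, and the LEAST(top_number * 10, 500) cap come from the description above.

```python
# Sketch only: paragraph_id, the embedding column name and the <=> (cosine
# distance) operator are assumptions, not taken from the PR's actual SQL.
EMBEDDING_SEARCH_SQL = """
WITH vector_top AS (
    -- ORDER BY <distance expression> plus LIMIT is the query shape that
    -- pgvector can answer from an HNSW index
    SELECT paragraph_id, embedding <=> %s::vector AS distance
    FROM embedding
    WHERE knowledge_id = %s
    ORDER BY embedding <=> %s::vector
    LIMIT LEAST(%s * 10, 500)
)
SELECT paragraph_id, similarity
FROM (
    -- keep only the closest row for each paragraph
    SELECT DISTINCT ON (paragraph_id) paragraph_id, 1 - distance AS similarity
    FROM vector_top
    ORDER BY paragraph_id, distance
) AS best_per_paragraph
WHERE similarity > %s
ORDER BY similarity DESC
LIMIT %s
"""


def embedding_search_params(query_vector, knowledge_id, top_number, min_similarity):
    """Build parameters in the same order as the placeholders above (hypothetical helper)."""
    vec = "[" + ",".join(str(x) for x in query_vector) + "]"
    # The vector appears twice: once in the SELECT list, once in ORDER BY.
    return [vec, knowledge_id, vec, top_number, min_similarity, top_number]
```

Two caveats apply to a pattern like this. The ORDER BY expression has to match the indexed expression and operator class, otherwise the planner will not pick the index. And because deduplication and the threshold run after the LIMIT, the final result can contain fewer than top_number paragraphs; the 10x over-fetch is what compensates for that.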

blend_search.sql: Two-phase approach. A CTE first gets vector candidates via HNSW (with LIMIT), then a JOIN against embedding computes ts_rank_cd text scores only for the candidate set. COALESCE(..., 0) supplies a zero text score for rows without one.
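A possible shape for the two-phase query is sketched below. It is not the PR's SQL: the id and paragraph_id columns, the cosine-distance operator, and the choice to add the two scores together are assumptions made for illustration, as is the helper blend_search_params.

```python
# Sketch only: id, paragraph_id, the <=> operator and the "vector score +
# text score" combination are assumptions, not the PR's actual SQL.
BLEND_SEARCH_SQL = """
WITH vector_top AS (
    -- phase 1: nearest candidates, served by the HNSW index
    SELECT id, paragraph_id, embedding <=> %s::vector AS distance
    FROM embedding
    WHERE knowledge_id = %s
    ORDER BY embedding <=> %s::vector
    LIMIT LEAST(%s * 10, 500)
)
SELECT paragraph_id, score
FROM (
    -- phase 2: text rank is computed for the candidate rows only
    SELECT DISTINCT ON (v.paragraph_id)
           v.paragraph_id,
           (1 - v.distance)
             + COALESCE(ts_rank_cd(e.search_vector,
                                   websearch_to_tsquery('simple', %s)), 0) AS score
    FROM vector_top AS v
    LEFT JOIN embedding AS e
           ON e.id = v.id
          AND e.search_vector @@ websearch_to_tsquery('simple', %s)
    ORDER BY v.paragraph_id, score DESC
) AS best_per_paragraph
WHERE score > %s
ORDER BY score DESC
LIMIT %s
"""


def blend_search_params(query_vector, query_text, knowledge_id, top_number, min_score):
    """Build parameters in placeholder order (hypothetical helper)."""
    vec = "[" + ",".join(str(x) for x in query_vector) + "]"
    return [vec, knowledge_id, vec, top_number,
            query_text, query_text, min_score, top_number]
```

In this sketch a candidate with no text match finds no row on the right side of the LEFT JOIN, so ts_rank_cd receives NULL and COALESCE turns the missing text score into 0 instead of dropping the row.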

keywords_search.sql: Add an AND search_vector @@ websearch_to_tsquery('simple', %s) pre-filter so the GIN index is used, avoiding a full-table ts_rank_cd computation.
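As a rough illustration of the pre-filter, the query might look like the sketch below. The paragraph_id column and the helper keywords_search_params are assumptions; the search_vector column, the 'simple' configuration, and the @@ filter are from the description above.

```python
# Sketch only: paragraph_id and the overall query layout are assumptions.
KEYWORDS_SEARCH_SQL = """
SELECT paragraph_id, similarity
FROM (
    SELECT DISTINCT ON (paragraph_id)
           paragraph_id,
           ts_rank_cd(search_vector, websearch_to_tsquery('simple', %s)) AS similarity
    FROM embedding
    WHERE knowledge_id = %s
      -- the @@ match can be answered from a GIN index on search_vector,
      -- so ts_rank_cd only runs on rows that actually match
      AND search_vector @@ websearch_to_tsquery('simple', %s)
    ORDER BY paragraph_id, similarity DESC
) AS best_per_paragraph
WHERE similarity > %s
ORDER BY similarity DESC
LIMIT %s
"""


def keywords_search_params(query_text, knowledge_id, min_similarity, top_number):
    """Build parameters in placeholder order (hypothetical helper)."""
    # The query text is bound twice: once for ranking, once for the @@ filter.
    return [query_text, knowledge_id, query_text, min_similarity, top_number]
```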

Per-KB Query Routing (pg_vector.py)

Split hit_test() and query() to iterate per knowledge base when multiple KBs are queried. Each per-KB query filters with knowledge_id=kid (an exact match), which lets PostgreSQL use the corresponding partial HNSW index. Results from all KBs are then merged and sorted by similarity. The single-KB case is handled directly to avoid this overhead.
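The routing can be sketched in a few lines. This is a minimal illustration, not the code in pg_vector.py: search_one_kb is a hypothetical callable standing in for one per-KB SQL query, and hits are assumed to be dicts with a "similarity" key.

```python
from typing import Callable, Dict, List

Hit = Dict[str, object]


def search_per_kb(
    knowledge_id_list: List[str],
    search_one_kb: Callable[[str], List[Hit]],  # hypothetical: runs one exact-match query
    top_number: int,
) -> List[Hit]:
    """Run one query per knowledge base, then merge and rank the results."""
    if len(knowledge_id_list) == 1:
        # Single KB: no merge step needed.
        return search_one_kb(knowledge_id_list[0])[:top_number]

    merged: List[Hit] = []
    for kid in knowledge_id_list:
        # Each call filters on knowledge_id = kid, which matches the
        # predicate of that KB's partial index.
        merged.extend(search_one_kb(kid))

    merged.sort(key=lambda hit: hit["similarity"], reverse=True)
    return merged[:top_number]
```

The trade-off is N round trips instead of one, which pays off when each per-KB query is an index scan and the single combined query is a full table scan.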

Parameter Update (pg_vector.py)

Update parameter arrays in all 3 search classes (EmbeddingSearch, KeywordsSearch, BlendSearch) to match the new SQL placeholder order.

Performance Results

Tested on production data: 770K vectors, 22GB embedding table, 5 knowledge bases, PostgreSQL 17 + pgvector

The test environment uses a 3840-dimension embedding model, which exceeds the 2,000-dimension limit pgvector imposes on HNSW indexes over the vector type, so a halfvec configuration was applied separately. The optimizations in this PR are dimension-independent and benefit all deployments.

Search Mode         Before      After     Speedup
blend_search        16,358ms    ~220ms    74x
embedding_search    6,551ms     ~160ms    41x
keywords_search     10,662ms    ~20ms     533x
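The speedup column follows directly from the before and after timings. A quick check, using the approximate "after" values from the table as exact numbers:

```python
# Timings in milliseconds, copied from the table above: (before, after).
timings_ms = {
    "blend_search": (16358, 220),
    "embedding_search": (6551, 160),
    "keywords_search": (10662, 20),
}

# Speedup is simply before / after, rounded to the nearest integer.
speedups = {
    mode: round(before / after)
    for mode, (before, after) in timings_ms.items()
}

print(speedups)
# {'blend_search': 74, 'embedding_search': 41, 'keywords_search': 533}
```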

Additional Recommendation: Disable PostgreSQL JIT

PostgreSQL JIT compilation was designed for long-running analytical queries. For vector search queries that complete in <10ms with HNSW indexes, JIT compilation overhead (50-200ms) far exceeds the actual query execution time.

Recommended: Add jit = off to postgresql.conf. PostgreSQL 19 will disable JIT by default, aligning with this recommendation.

Before disabling JIT (single embedding query):

JIT compilation: 159ms
Query execution: 3ms
Total: ~162ms

After disabling JIT:

Query execution: 3.4ms
Total: ~3.4ms (47x faster)
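Where editing postgresql.conf is not an option, jit can also be turned off for a single session, since it is a user-settable parameter. The sketch below assumes a DB-API 2.0 style cursor; run_without_jit is a hypothetical helper and is not part of this PR.

```python
def run_without_jit(cursor, sql, params):
    """Disable JIT for this session, then run the query (hypothetical helper).

    `cursor` is assumed to be a DB-API 2.0 cursor on a PostgreSQL connection.
    """
    # Session-level setting: stays in effect for the rest of this connection.
    cursor.execute("SET jit = off")
    cursor.execute(sql, params)
    return cursor.fetchall()
```

With pooled connections the setting persists on the pooled connection, so it only needs to take effect once per connection.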

Prerequisites

PostgreSQL with pgvector extension (HNSW indexes are already created by common.py per knowledge base)

GIN index on search_vector column (recommended for keywords_search optimization):

CREATE INDEX embedding_search_vector_gin_idx ON embedding USING GIN (search_vector);

Testing

Tested with 770K vectors (22GB) on PostgreSQL 17 + pgvector
All 3 search modes (embedding/keywords/blend) return correct results
Multi-KB query tested (5 knowledge bases)
Backward compatible - no schema changes required, only SQL query + routing optimization

f2c-ci-robot · 2026-05-24T09:59:42Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

f2c-ci-robot · 2026-05-24T09:59:59Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…eedup)

Problem: SQL queries perform full table scans despite per-KB HNSW indexes existing in common.py.

Two root causes:
1. SQL patterns (ORDER BY distance without LIMIT, ts_rank_cd without @@) don't trigger index usage
2. knowledge_id__in bypasses partial indexes (WHERE knowledge_id = '{k_id}')

Changes (4 files):

SQL optimization:
- embedding_search.sql: CTE + LIMIT to fetch top-K candidates via HNSW
- blend_search.sql: Two-phase - HNSW candidates first, then JOIN for text scores
- keywords_search.sql: Add @@ GIN pre-filter before ts_rank_cd

Query routing (pg_vector.py):
- Split hit_test()/query() to iterate per-KB when multiple knowledge bases are queried, ensuring each query hits its partial HNSW index
- Update parameter arrays in 3 search classes for new SQL placeholder order

Benchmark (770K vectors, 3840 dims, 22GB, 5 KBs):
- blend: 16,358ms -> ~220ms (74x)
- embedding: 6,551ms -> ~160ms (41x)
- keywords: 10,662ms -> ~20ms (533x)

f2c-ci-robot Bot added the do-not-merge/release-note-label-needed label May 24, 2026

zhouliang5266 force-pushed the perf/hnsw-vector-search branch 3 times, most recently from 7c2110e to 68de0ef Compare May 24, 2026 10:26

zhouliang5266 force-pushed the perf/hnsw-vector-search branch from 68de0ef to c859456 Compare May 24, 2026 11:29

zhouliang5266 wants to merge 1 commit into 1Panel-dev:v2 from zhouliang5266:perf/hnsw-vector-search
