BE-629: Add spherical k-means entity clustering endpoint via /entities/embeddings/clusters#8919

Open
indietyp wants to merge 8 commits into main from bm/be-629-implement-kmeans-clustering-in-the-hash-graph

Conversation

@indietyp

Member

🌟 What is the purpose of this PR?

This PR adds a POST /entities/embeddings/clusters endpoint that groups a set of entities by embedding similarity using spherical k-means clustering. Callers supply a list of entity IDs, a desired cluster count, and an optional embedding dimension (matryoshka truncation). The response contains the cluster assignments with unit-normalized centroids, plus a list of entities that had no stored embedding.

The clustering algorithm is implemented from scratch in Rust using SIMD-accelerated kernels (f32x8), k-means++ seeding, multiple restarts, and parallel assignment via Rayon. Embeddings are truncated server-side in Postgres using subvector before being sent over the wire, keeping network cost proportional to the requested dimension rather than the full stored width.
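
As a rough illustration of matryoshka truncation, the sketch below keeps the first `dim` components of a stored embedding and rescales the prefix to unit length. The PR performs the truncation in Postgres via subvector; the Rust function here, its name, and the choice to renormalize at this point are assumptions made for illustration only.

```rust
/// Hypothetical helper: keeps the first `dim` components of `embedding`
/// and rescales the prefix to unit length.
/// Returns `None` if `dim` is zero, exceeds the stored width,
/// or the prefix has zero norm (it cannot be normalized).
fn truncate_and_normalize(embedding: &[f32], dim: usize) -> Option<Vec<f32>> {
    if dim == 0 || dim > embedding.len() {
        return None;
    }
    let prefix = &embedding[..dim];
    let norm = prefix.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(prefix.iter().map(|x| x / norm).collect())
}
```

Truncating before transfer means only `dim` floats per entity cross the wire, which is the cost argument made above.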

The implementation is up to 24x faster than existing CPU-based clustering crates.

🔍 What does this change?

  • Adds a Dimension newtype that enforces the positive-multiple-of-8 invariant required by the SIMD kernels.
  • Adds a kernel module with SIMD-accelerated primitives: dot, add_into, scale_into, scale, add_scaled_into, normalize, micro_4x2 (4-point × 2-centroid tiled dot product), and nearest4 (nearest-centroid search for 4 points simultaneously).
  • Adds a clustering module implementing spherical k-means with k-means++ D² seeding, Lloyd iterations, empty-cluster reseeding, convergence tolerance, and configurable restarts via a Config struct.
  • Adds ClusterEntitiesParams, EntityCluster, and ClusterEntitiesResponse types to the entity store API.
  • Adds a ClusterError error type covering invalid dimension, dimension-too-large, and store failure cases.
  • Adds cluster_entities to the EntityStore trait and implements it in the Postgres store, including permission filtering that avoids leaking which entity IDs were denied versus missing embeddings.
  • Registers the new endpoint at POST /entities/embeddings/clusters and nests the existing POST /entities/embeddings handler under /entities/embeddings/ to keep the routing consistent.
  • Exposes the new types and endpoint in the OpenAPI schema.
  • Forwards cluster_entities through the type-fetcher store wrapper and the integration test DatabaseApi shim.
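
The core loop of spherical k-means can be sketched in plain scalar Rust as follows. This is a simplified illustration, not the PR's code: it omits SIMD, Rayon, k-means++ seeding, restarts, and convergence tolerance, and it leaves an empty cluster's centroid unchanged where the PR reseeds it. All function names are hypothetical.

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Rescales `v` to unit length; returns false (leaving `v` untouched) for a zero vector.
fn normalize(v: &mut [f32]) -> bool {
    let norm = dot(v, v).sqrt();
    if norm == 0.0 {
        return false;
    }
    for x in v.iter_mut() {
        *x /= norm;
    }
    true
}

/// Assignment step: each point goes to the centroid with the largest dot product,
/// which for unit vectors is the centroid with the highest cosine similarity.
fn assign(points: &[Vec<f32>], centroids: &[Vec<f32>]) -> Vec<usize> {
    points
        .iter()
        .map(|p| {
            let mut best = 0;
            let mut best_sim = f32::NEG_INFINITY;
            for (c, centroid) in centroids.iter().enumerate() {
                let sim = dot(p, centroid);
                if sim > best_sim {
                    best_sim = sim;
                    best = c;
                }
            }
            best
        })
        .collect()
}

/// Update step: each centroid becomes the unit-normalized sum of its members.
/// Empty clusters (or clusters whose members cancel out) keep their old centroid.
fn update(points: &[Vec<f32>], labels: &[usize], centroids: &mut [Vec<f32>]) {
    let dim = centroids.first().map_or(0, Vec::len);
    let mut sums = vec![vec![0.0_f32; dim]; centroids.len()];
    let mut counts = vec![0_usize; centroids.len()];
    for (p, &label) in points.iter().zip(labels) {
        counts[label] += 1;
        for (s, x) in sums[label].iter_mut().zip(p) {
            *s += x;
        }
    }
    for ((centroid, mut sum), count) in centroids.iter_mut().zip(sums).zip(counts) {
        if count > 0 && normalize(&mut sum) {
            *centroid = sum;
        }
    }
}

/// Runs Lloyd iterations until the labels stop changing or `max_iters` is reached.
fn spherical_kmeans(
    points: &[Vec<f32>],
    mut centroids: Vec<Vec<f32>>,
    max_iters: usize,
) -> (Vec<usize>, Vec<Vec<f32>>) {
    let mut labels = assign(points, &centroids);
    for _ in 0..max_iters {
        update(points, &labels, &mut centroids);
        let next = assign(points, &centroids);
        if next == labels {
            break;
        }
        labels = next;
    }
    (labels, centroids)
}
```

Compared with ordinary k-means, the only differences are that similarity is a dot product and that each centroid is projected back onto the unit sphere after the update.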

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🛡 What tests cover this?

  • Unit tests for squared_chord_distance covering identical, orthogonal, opposite, and zero-norm cases.
  • Unit tests for all SIMD kernel functions (dot, add_into, scale_into, scale, add_scaled_into, normalize, micro_4x2, nearest4) verified against scalar reference implementations.
  • Unit tests for the clustering algorithm covering: empty input, k=0, k=1, single point, n < 4, n = k, well-separated blobs (>95% accuracy), determinism with the same seed, unit-normalized centroids, labels in range, labels nearest to assigned centroid, subsampled clustering, more clusters than natural groups, all-identical points, and mixed zero-norm rows.
  • Unit tests for the Dimension newtype covering valid multiples of 8, zero rejection, and non-multiples rejection.
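
For reference, the squared chord distance between two vectors is the squared Euclidean distance between their unit-normalized forms, which equals 2 − 2·cos θ. The sketch below shows the identical, orthogonal, and opposite cases listed above. The signature and the choice to return `None` for zero-norm input are assumptions; the PR's squared_chord_distance may handle that case differently.

```rust
/// Hypothetical sketch: squared chord distance, i.e. 2 - 2 * cos(a, b).
/// Returns `None` when either input has zero norm (an assumption made here).
fn squared_chord_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    // Clamp to absorb rounding error so the result stays within [0, 4].
    let cosine = (dot / (norm_a * norm_b)).clamp(-1.0, 1.0);
    Some(2.0 - 2.0 * cosine)
}
```

Identical directions give 0, orthogonal vectors give 2, and opposite vectors give 4.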

❓ How to test this?

  1. Check out the branch.
  2. Ensure entities with stored embeddings exist in the database.
  3. Send a POST /entities/embeddings/clusters request with a JSON body containing entityIds, clusterCount, and optionally dimension and seed.
  4. Confirm the response contains clusters (each with clusterId, entityIds, and centroid) and missingEmbeddings for any entities without stored embeddings.
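
To make the expected response shape concrete, the sketch below groups per-entity labels into clusters and drops clusters that received no members. The struct mirrors the clusterId, entityIds, and centroid fields named above, but the field types, the use of plain strings for entity IDs, and the rule that a cluster ID is its centroid index are assumptions for illustration.

```rust
/// Hypothetical stand-in for the response's cluster entries.
#[derive(Debug, PartialEq)]
struct EntityCluster {
    cluster_id: usize,
    entity_ids: Vec<String>,
    centroid: Vec<f32>,
}

/// Groups entity IDs by their assigned label and skips empty clusters.
/// Panics if a label is out of range for `centroids`.
fn build_clusters(
    entity_ids: &[&str],
    labels: &[usize],
    centroids: &[Vec<f32>],
) -> Vec<EntityCluster> {
    assert_eq!(entity_ids.len(), labels.len());
    let mut members: Vec<Vec<String>> = vec![Vec::new(); centroids.len()];
    for (id, &label) in entity_ids.iter().zip(labels) {
        members[label].push((*id).to_string());
    }
    members
        .into_iter()
        .zip(centroids)
        .enumerate()
        .filter(|(_, (ids, _))| !ids.is_empty())
        .map(|(cluster_id, (entity_ids, centroid))| EntityCluster {
            cluster_id,
            entity_ids,
            centroid: centroid.clone(),
        })
        .collect()
}
```

When checking a real response, every requested entity ID should appear exactly once, either in some cluster's entityIds or in missingEmbeddings.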

feat: checkpoint

feat: embedding clustering

fix: merge

Copilot AI review requested due to automatic review settings June 30, 2026 09:46
@vercel

vercel Bot commented Jun 30, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hash Ready Ready Preview, Comment Jul 1, 2026 9:20am
2 Skipped Deployments
Project Deployment Actions Updated (UTC)
hashdotdesign-tokens Ignored Ignored Preview Jul 1, 2026 9:20am
petrinaut Skipped Skipped Jul 1, 2026 9:20am

@vercel vercel Bot temporarily deployed to Preview – petrinaut June 30, 2026 09:46 Inactive
@cursor

cursor Bot commented Jun 30, 2026


PR Summary

Medium Risk
New public API and CPU-heavy blocking work on embedding reads; permission handling is deliberate but worth validating under load and with large entity lists.

Overview
Adds POST /entities/embeddings/clusters so clients can group entities by embedding similarity. Request body includes entity IDs, cluster count, optional matryoshka dimension (default 256), and optional seed; response returns non-empty clusters with unit-normalized centroids plus missingEmbeddings for IDs that could not be clustered.
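
The optional seed makes the randomized seeding step reproducible. A minimal sketch of k-means++ D² seeding over unit vectors, driven by a seeded generator, could look like the following. The SplitMix64 generator, the function names, and the use of squared chord distance as the D² weight are assumptions; the PR may use a different RNG and weighting.

```rust
/// Small deterministic generator (SplitMix64); a stand-in for whatever RNG the PR uses.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Squared chord distance for unit vectors: 2 - 2 * dot, floored at zero.
fn chord_sq(a: &[f32], b: &[f32]) -> f64 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    (2.0 - 2.0 * f64::from(dot)).max(0.0)
}

/// Picks up to `k` distinct point indices: the first uniformly at random, each later
/// one with probability proportional to its squared distance to the nearest pick so far.
fn kmeans_pp_seed(points: &[Vec<f32>], k: usize, seed: u64) -> Vec<usize> {
    if points.is_empty() || k == 0 {
        return Vec::new();
    }
    let mut rng = SplitMix64(seed);
    let first = (rng.next_u64() % points.len() as u64) as usize;
    let mut chosen = vec![first];
    let mut dist: Vec<f64> = points.iter().map(|p| chord_sq(p, &points[first])).collect();
    dist[first] = 0.0;
    while chosen.len() < k.min(points.len()) {
        let total: f64 = dist.iter().sum();
        if total <= 0.0 {
            break; // every remaining point coincides with an existing seed
        }
        let mut target = rng.next_f64() * total;
        let mut pick = None;
        for (i, &d) in dist.iter().enumerate() {
            if d <= 0.0 {
                continue;
            }
            pick = Some(i); // the last candidate doubles as a rounding fallback
            if target < d {
                break;
            }
            target -= d;
        }
        let pick = match pick {
            Some(i) => i,
            None => break,
        };
        chosen.push(pick);
        for (d, p) in dist.iter_mut().zip(points) {
            *d = d.min(chord_sq(p, &points[pick]));
        }
        dist[pick] = 0.0;
    }
    chosen
}
```

Calling the function twice with the same points and seed returns the same indices, which is the property a determinism test would check.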

Implements spherical k-means in hash-graph-store (embedding/clustering, SIMD kernel, Dimension type) with k-means++, multi-restart Lloyd, and Rayon parallelism. EntityStore::cluster_entities loads truncated vectors via Postgres subvector, filters by ViewEntity permission, runs clustering on spawn_blocking, and treats denied vs missing embeddings the same in missing_embeddings to avoid permission leaks.
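
The permission behavior described above can be illustrated with a small sketch: any requested ID that is either not viewable or has no embedding lands in the same list, so a caller cannot tell the two cases apart. The function name, the use of string IDs, and the set-based inputs are hypothetical; the PR applies the filter inside the Postgres store.

```rust
use std::collections::HashSet;

/// Hypothetical sketch: splits requested IDs into (clusterable, missing).
/// An ID is clusterable only if the actor may view it AND it has an embedding;
/// denied, unknown, and embedding-less IDs are all reported as missing.
fn partition_requested<'a>(
    requested: &[&'a str],
    permitted: &HashSet<&str>,
    with_embedding: &HashSet<&str>,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut clusterable = Vec::new();
    let mut missing = Vec::new();
    for &id in requested {
        if permitted.contains(id) && with_embedding.contains(id) {
            clusterable.push(id);
        } else {
            missing.push(id);
        }
    }
    (clusterable, missing)
}
```

Because the response carries a single undifferentiated list, it reveals nothing about whether a given entity exists but is hidden from the caller.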

Routing nests POST /entities/embeddings under /embeddings/ and adds /embeddings/clusters; OpenAPI, type-fetcher, and integration test shims are updated. ClusterError covers invalid or oversized dimensions and store failures.

Reviewed by Cursor Bugbot for commit 2953664.

@github-actions github-actions Bot added the labels area/libs (Relates to first-party libraries/crates/packages), type/eng > backend (Owned by the @backend team), and area/tests (New or updated tests) Jun 30, 2026

Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Comment thread libs/@local/graph/store/src/embedding/clustering.rs

Copilot AI left a comment

Contributor

Pull request overview

Adds embedding-based spherical k-means clustering to the graph store and exposes it via a new REST endpoint (POST /entities/embeddings/clusters). The implementation introduces a SIMD-accelerated Rust clustering engine and wires it through the Postgres store, API routing/OpenAPI, and relevant store wrappers/shims.

Changes:

  • Introduces a new embedding module in hash_graph_store with a Dimension invariant type, SIMD kernels, and a spherical k-means implementation.
  • Extends the EntityStore API with cluster_entities and implements it in the Postgres store, including permission filtering and embedding truncation via subvector.
  • Adds the REST endpoint and forwards the new store method through type-fetcher and integration test shims.
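
To show why a multiple-of-8 invariant is convenient for eight-lane kernels, here is a minimal sketch of a Dimension-style newtype together with a dot product that walks the input in chunks of eight. It uses scalar code on stable Rust as a stand-in for f32x8 SIMD; the constructor signature, the `get` accessor, and the `LANES` constant are assumptions, not the PR's API.

```rust
/// Hypothetical newtype: a vector width that is a positive multiple of 8.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Dimension(usize);

impl Dimension {
    const LANES: usize = 8;

    /// Returns `None` for zero or for any value that is not a multiple of 8.
    fn new(value: usize) -> Option<Self> {
        (value > 0 && value % Self::LANES == 0).then_some(Self(value))
    }

    fn get(self) -> usize {
        self.0
    }
}

/// Dot product with eight independent accumulators, mirroring the shape of an
/// eight-lane kernel. Because the width is a multiple of 8, no remainder loop is needed.
fn dot(a: &[f32], b: &[f32], dim: Dimension) -> f32 {
    assert_eq!(a.len(), dim.get());
    assert_eq!(b.len(), dim.get());
    let mut acc = [0.0_f32; Dimension::LANES];
    for (ca, cb) in a
        .chunks_exact(Dimension::LANES)
        .zip(b.chunks_exact(Dimension::LANES))
    {
        for lane in 0..Dimension::LANES {
            acc[lane] += ca[lane] * cb[lane];
        }
    }
    acc.iter().sum()
}
```

Validating the width once at construction lets every kernel skip tail handling and per-call width checks beyond a length assertion.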

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/graph/integration/postgres/lib.rs | Forwards cluster_entities through the DatabaseApi integration shim. |
| libs/@local/graph/type-fetcher/src/store.rs | Forwards cluster_entities through the type-fetcher store wrapper. |
| libs/@local/graph/store/src/lib.rs | Enables required nightly features and registers the new embedding module. |
| libs/@local/graph/store/src/error.rs | Adds ClusterError for clustering-related failures. |
| libs/@local/graph/store/src/entity/store.rs | Adds request/response types and the EntityStore::cluster_entities trait method. |
| libs/@local/graph/store/src/entity/mod.rs | Re-exports the new clustering API types. |
| libs/@local/graph/store/src/embedding/mod.rs | Declares the new embedding submodules and lint expectations. |
| libs/@local/graph/store/src/embedding/kernel.rs | Implements SIMD-accelerated vector primitives and tests. |
| libs/@local/graph/store/src/embedding/dimension.rs | Adds Dimension newtype enforcing “positive multiple of 8”. |
| libs/@local/graph/store/src/embedding/clustering.rs | Implements spherical k-means (+ seeding/restarts/parallel assignment) and tests. |
| libs/@local/graph/postgres-store/src/store/postgres/knowledge/entity/mod.rs | Implements cluster_entities query + permission filtering + clustering execution. |
| libs/@local/graph/api/src/rest/entity.rs | Registers the new REST endpoint and nests existing embeddings routing. |

Comment thread libs/@local/graph/store/src/embedding/clustering.rs
Comment thread libs/@local/graph/postgres-store/src/store/postgres/knowledge/entity/mod.rs Outdated
Comment thread libs/@local/graph/api/src/rest/entity.rs
Comment thread libs/@local/graph/store/src/entity/store.rs
@indietyp indietyp force-pushed the bm/be-629-implement-kmeans-clustering-in-the-hash-graph branch from bf17f16 to a9bcac4 Compare July 1, 2026 08:08
@github-actions github-actions Bot added the area/deps Relates to third-party dependencies (area) label Jul 1, 2026
@codecov

codecov Bot commented Jul 1, 2026


Codecov Report

❌ Patch coverage is 93.89881% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.89%. Comparing base (b8971f3) to head (2953664).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...s-store/src/store/postgres/knowledge/entity/mod.rs | 0.00% | 30 Missing ⚠️ |
| libs/@local/graph/api/src/rest/entity.rs | 0.00% | 24 Missing ⚠️ |
| ...ibs/@local/graph/store/src/embedding/clustering.rs | 98.50% | 8 Missing and 4 partials ⚠️ |
| libs/@local/graph/store/src/embedding/kernel.rs | 97.57% | 4 Missing and 7 partials ⚠️ |
| libs/@local/graph/store/src/embedding/dimension.rs | 91.17% | 3 Missing ⚠️ |
| libs/@local/graph/store/src/entity/store.rs | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8919      +/-   ##
==========================================
+ Coverage   59.57%   59.89%   +0.31%     
==========================================
  Files        1366     1369       +3     
  Lines      132760   134183    +1423     
  Branches     6045     6095      +50     
==========================================
+ Hits        79094    80365    +1271     
- Misses      52732    52870     +138     
- Partials      934      948      +14     

| Flag | Coverage Δ |
| --- | --- |
| apps.hash-ai-worker-ts | 1.39% <ø> (ø) |
| apps.hash-api | 6.39% <ø> (ø) |
| local.hash-backend-utils | 2.81% <ø> (ø) |
| local.hash-graph-sdk | 10.00% <ø> (ø) |
| local.hash-isomorphic-utils | 0.18% <ø> (ø) |
| rust.hash-graph-api | 2.48% <0.00%> (-0.02%) ⬇️ |
| rust.hash-graph-postgres-store | 29.33% <0.00%> (-0.05%) ⬇️ |
| rust.hash-graph-store | 50.86% <97.82%> (+12.64%) ⬆️ |
| rust.hash-graph-validation | 83.43% <ø> (ø) |
| rust.hashql-compiletest | 28.40% <ø> (+0.15%) ⬆️ |
| rust.hashql-eval | 75.23% <ø> (ø) |

Copilot AI review requested due to automatic review settings July 1, 2026 08:13
@vercel vercel Bot temporarily deployed to Preview – petrinaut July 1, 2026 08:13 Inactive
@codspeed-hq

codspeed-hq Bot commented Jul 1, 2026


Merging this PR will degrade performance by 15.38%

❌ 2 regressed benchmarks
✅ 78 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- |
| bit_matrix/dense/iter_row[64] | 140.8 ns | 170 ns | -17.16% |
| bit_matrix/dense/iter_row[200] | 185.8 ns | 215 ns | -13.57% |

Comparing bm/be-629-implement-kmeans-clustering-in-the-hash-graph (2953664) with main (7f74e1c) [1]

Footnotes

  1. No successful run was found on main (d77f370) during the generation of this report, so 7f74e1c was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@vercel vercel Bot temporarily deployed to Preview – petrinaut July 1, 2026 08:17 Inactive

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Comment thread libs/@local/graph/store/src/entity/store.rs
Comment thread libs/@local/graph/store/src/error.rs
Copilot AI review requested due to automatic review settings July 1, 2026 08:18
Comment thread libs/@local/graph/store/src/embedding/kernel.rs Fixed
Comment thread libs/@local/graph/store/src/embedding/kernel.rs Fixed

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Comment thread libs/@local/graph/postgres-store/src/store/postgres/knowledge/entity/mod.rs Outdated
Comment thread libs/@local/graph/store/src/embedding/clustering.rs
Copilot AI review requested due to automatic review settings July 1, 2026 08:31
@vercel vercel Bot temporarily deployed to Preview – petrinaut July 1, 2026 08:31 Inactive

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 5db85e2.

Comment thread libs/@local/graph/store/src/embedding/clustering.rs

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Comment thread libs/@local/graph/store/src/embedding/clustering.rs
@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

Benchmark results

@rust/hash-graph-benches – Integrations

policy_resolution_large

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 2002 $$27.4 \mathrm{ms} \pm 192 \mathrm{μs}\left({\color{red}5.75 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$3.72 \mathrm{ms} \pm 31.0 \mathrm{μs}\left({\color{red}6.28 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 1002 $$14.3 \mathrm{ms} \pm 119 \mathrm{μs}\left({\color{red}16.4 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 3314 $$45.2 \mathrm{ms} \pm 327 \mathrm{μs}\left({\color{red}6.43 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$16.3 \mathrm{ms} \pm 117 \mathrm{μs}\left({\color{red}12.0 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 1527 $$26.2 \mathrm{ms} \pm 185 \mathrm{μs}\left({\color{red}8.96 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 2078 $$28.5 \mathrm{ms} \pm 201 \mathrm{μs}\left({\color{red}5.12 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$3.98 \mathrm{ms} \pm 25.6 \mathrm{μs}\left({\color{red}5.93 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 1033 $$15.3 \mathrm{ms} \pm 92.3 \mathrm{μs}\left({\color{red}15.4 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_medium

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 102 $$3.84 \mathrm{ms} \pm 25.7 \mathrm{μs}\left({\color{gray}2.50 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$3.13 \mathrm{ms} \pm 17.7 \mathrm{μs}\left({\color{gray}3.98 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 52 $$3.56 \mathrm{ms} \pm 21.3 \mathrm{μs}\left({\color{red}5.31 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 269 $$5.37 \mathrm{ms} \pm 41.2 \mathrm{μs}\left({\color{gray}4.89 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$3.75 \mathrm{ms} \pm 21.7 \mathrm{μs}\left({\color{red}5.71 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 108 $$4.37 \mathrm{ms} \pm 27.2 \mathrm{μs}\left({\color{red}6.42 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 133 $$4.61 \mathrm{ms} \pm 30.2 \mathrm{μs}\left({\color{red}6.50 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$3.62 \mathrm{ms} \pm 18.1 \mathrm{μs}\left({\color{gray}4.17 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 63 $$4.23 \mathrm{ms} \pm 27.3 \mathrm{μs}\left({\color{gray}4.27 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_none

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 2 $$2.72 \mathrm{ms} \pm 16.5 \mathrm{μs}\left({\color{gray}1.02 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$2.62 \mathrm{ms} \pm 18.2 \mathrm{μs}\left({\color{gray}3.84 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 2 $$2.67 \mathrm{ms} \pm 11.8 \mathrm{μs}\left({\color{gray}-0.060 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 8 $$3.01 \mathrm{ms} \pm 39.3 \mathrm{μs}\left({\color{gray}0.947 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$2.78 \mathrm{ms} \pm 16.6 \mathrm{μs}\left({\color{gray}0.825 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 3 $$2.93 \mathrm{ms} \pm 12.4 \mathrm{μs}\left({\color{gray}-0.296 \mathrm{\%}}\right) $$ Flame Graph

policy_resolution_small

Function Value Mean Flame graphs
resolve_policies_for_actor user: empty, selectivity: high, policies: 52 $$3.12 \mathrm{ms} \pm 16.1 \mathrm{μs}\left({\color{gray}3.14 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: low, policies: 1 $$2.95 \mathrm{ms} \pm 42.9 \mathrm{μs}\left({\color{red}7.32 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: empty, selectivity: medium, policies: 26 $$3.12 \mathrm{ms} \pm 19.0 \mathrm{μs}\left({\color{gray}4.54 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: high, policies: 94 $$3.53 \mathrm{ms} \pm 21.3 \mathrm{μs}\left({\color{gray}3.68 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: low, policies: 1 $$3.12 \mathrm{ms} \pm 15.8 \mathrm{μs}\left({\color{gray}3.13 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: seeded, selectivity: medium, policies: 27 $$3.47 \mathrm{ms} \pm 25.2 \mathrm{μs}\left({\color{red}5.18 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: high, policies: 66 $$3.48 \mathrm{ms} \pm 21.3 \mathrm{μs}\left({\color{gray}3.64 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: low, policies: 1 $$3.08 \mathrm{ms} \pm 17.1 \mathrm{μs}\left({\color{gray}1.97 \mathrm{\%}}\right) $$ Flame Graph
resolve_policies_for_actor user: system, selectivity: medium, policies: 29 $$3.42 \mathrm{ms} \pm 22.8 \mathrm{μs}\left({\color{gray}4.30 \mathrm{\%}}\right) $$ Flame Graph

read_scaling_complete

Function Value Mean Flame graphs
entity_by_id;one_depth 1 entities $$43.8 \mathrm{ms} \pm 262 \mathrm{μs}\left({\color{gray}4.54 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 10 entities $$35.1 \mathrm{ms} \pm 187 \mathrm{μs}\left({\color{red}5.69 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 25 entities $$38.0 \mathrm{ms} \pm 206 \mathrm{μs}\left({\color{red}7.57 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 5 entities $$34.1 \mathrm{ms} \pm 188 \mathrm{μs}\left({\color{gray}-0.573 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;one_depth 50 entities $$45.4 \mathrm{ms} \pm 229 \mathrm{μs}\left({\color{red}9.17 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 1 entities $$51.6 \mathrm{ms} \pm 338 \mathrm{μs}\left({\color{red}5.62 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 10 entities $$43.2 \mathrm{ms} \pm 224 \mathrm{μs}\left({\color{red}8.30 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 25 entities $$95.3 \mathrm{ms} \pm 609 \mathrm{μs}\left({\color{gray}2.92 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 5 entities $$35.8 \mathrm{ms} \pm 211 \mathrm{μs}\left({\color{red}9.02 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;two_depth 50 entities $$288 \mathrm{ms} \pm 1.81 \mathrm{ms}\left({\color{gray}2.47 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 1 entities $$11.2 \mathrm{ms} \pm 58.5 \mathrm{μs}\left({\color{red}7.36 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 10 entities $$11.4 \mathrm{ms} \pm 77.3 \mathrm{μs}\left({\color{red}7.27 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 25 entities $$11.3 \mathrm{ms} \pm 98.4 \mathrm{μs}\left({\color{red}6.99 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 5 entities $$11.1 \mathrm{ms} \pm 73.2 \mathrm{μs}\left({\color{red}8.13 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id;zero_depth 50 entities $$11.4 \mathrm{ms} \pm 73.7 \mathrm{μs}\left({\color{red}8.01 \mathrm{\%}}\right) $$ Flame Graph

read_scaling_linkless

Function Value Mean Flame graphs
entity_by_id 1 entities $$11.1 \mathrm{ms} \pm 72.5 \mathrm{μs}\left({\color{gray}1.47 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 10 entities $$11.2 \mathrm{ms} \pm 83.5 \mathrm{μs}\left({\color{gray}4.36 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 100 entities $$11.3 \mathrm{ms} \pm 70.5 \mathrm{μs}\left({\color{red}7.25 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 1000 entities $$11.5 \mathrm{ms} \pm 80.0 \mathrm{μs}\left({\color{red}8.43 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id 10000 entities $$11.8 \mathrm{ms} \pm 114 \mathrm{μs}\left({\color{red}9.53 \mathrm{\%}}\right) $$ Flame Graph

representative_read_entity

Function Value Mean Flame graphs
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 $$11.5 \mathrm{ms} \pm 66.6 \mathrm{μs}\left({\color{red}5.30 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 $$11.3 \mathrm{ms} \pm 55.6 \mathrm{μs}\left({\color{gray}4.03 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 $$11.5 \mathrm{ms} \pm 64.3 \mathrm{μs}\left({\color{red}5.93 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 $$11.5 \mathrm{ms} \pm 57.0 \mathrm{μs}\left({\color{red}6.40 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 $$11.7 \mathrm{ms} \pm 74.1 \mathrm{μs}\left({\color{red}8.43 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 $$11.8 \mathrm{ms} \pm 76.7 \mathrm{μs}\left({\color{red}8.29 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 $$11.7 \mathrm{ms} \pm 99.0 \mathrm{μs}\left({\color{red}7.35 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 $$11.5 \mathrm{ms} \pm 73.3 \mathrm{μs}\left({\color{red}6.92 \mathrm{\%}}\right) $$ Flame Graph
entity_by_id entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 $$11.5 \mathrm{ms} \pm 69.3 \mathrm{μs}\left({\color{red}5.73 \mathrm{\%}}\right) $$ Flame Graph

representative_read_entity_type

Function Value Mean Flame graphs
get_entity_type_by_id Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba $$8.61 \mathrm{ms} \pm 42.6 \mathrm{μs}\left({\color{gray}1.01 \mathrm{\%}}\right) $$ Flame Graph

representative_read_multiple_entities

Function Value Mean Flame graphs
entity_by_property traversal_paths=0 0 $$62.4 \mathrm{ms} \pm 309 \mathrm{μs}\left({\color{red}11.3 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=255 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true $$116 \mathrm{ms} \pm 665 \mathrm{μs}\left({\color{red}7.33 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false $$67.5 \mathrm{ms} \pm 329 \mathrm{μs}\left({\color{red}10.6 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true $$77.8 \mathrm{ms} \pm 493 \mathrm{μs}\left({\color{red}10.0 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true $$87.0 \mathrm{ms} \pm 439 \mathrm{μs}\left({\color{red}8.67 \mathrm{\%}}\right) $$
entity_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true $$92.8 \mathrm{ms} \pm 445 \mathrm{μs}\left({\color{red}6.98 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=0 0 $$47.5 \mathrm{ms} \pm 358 \mathrm{μs}\left({\color{red}6.90 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=255 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true $$77.0 \mathrm{ms} \pm 480 \mathrm{μs}\left({\color{gray}3.61 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false $$53.6 \mathrm{ms} \pm 279 \mathrm{μs}\left({\color{gray}4.92 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true $$63.6 \mathrm{ms} \pm 362 \mathrm{μs}\left({\color{red}5.13 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true $$66.4 \mathrm{ms} \pm 395 \mathrm{μs}\left({\color{red}5.23 \mathrm{\%}}\right) $$
link_by_source_by_property traversal_paths=2 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true $$66.0 \mathrm{ms} \pm 325 \mathrm{μs}\left({\color{gray}4.16 \mathrm{\%}}\right) $$

scenarios

Function Value Mean Flame graphs
full_test query-limited $$123 \mathrm{ms} \pm 502 \mathrm{μs}\left({\color{gray}1.59 \mathrm{\%}}\right) $$ Flame Graph
full_test query-unlimited $$134 \mathrm{ms} \pm 572 \mathrm{μs}\left({\color{gray}1.22 \mathrm{\%}}\right) $$ Flame Graph
linked_queries query-limited $$19.8 \mathrm{ms} \pm 85.8 \mathrm{μs}\left({\color{red}9.37 \mathrm{\%}}\right) $$ Flame Graph
linked_queries query-unlimited $$553 \mathrm{ms} \pm 1.12 \mathrm{ms}\left({\color{gray}2.32 \mathrm{\%}}\right) $$ Flame Graph

Labels

  • area/deps: Relates to third-party dependencies (area)
  • area/libs: Relates to first-party libraries/crates/packages (area)
  • area/tests: New or updated tests
  • type/eng > backend: Owned by the @backend team

3 participants