The walker is ReembedJob
(src/main/precompiled/ai/ownsona/embeddings/ReembedJob.java).
It runs from MCPServer.<clinit> after DbMigrator and
RecordMigrator, and only when REEMBED_ON_STARTUP=true in
application.ini. On clean completion it flips that flag back
to false so a routine restart doesn’t accidentally re-trigger
the walker.
Order at startup:
1. DbMigrator: applies any new schema migrations
2. RecordMigrator: runs per-row upgraders
3. ReembedJob: re-embeds stale rows (only if REEMBED_ON_STARTUP=true)
The ordering matters: a switch to a model with a different
embedding dimension ships with a migration that resizes the
embedding column. The migration runs first, then the walker
fills the resized column with new vectors.
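The startup sequence and the flag flip can be sketched as follows. This is a minimal illustration, not the actual MCPServer code: the class StartupSketch, the method runStartup, the use of Runnable for each step, and the in-memory java.util.Properties standing in for application.ini are all assumptions made for the example. Only the step order and the REEMBED_ON_STARTUP flag come from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch of the startup ordering; not the real MCPServer initializer.
final class StartupSketch {

    /**
     * Runs the steps in the documented order and returns the names of the
     * steps that completed. The flag is cleared only after the walker
     * returns normally, so a failed walk leaves it set for the next restart.
     */
    static List<String> runStartup(Properties config,
                                   Runnable dbMigrator,
                                   Runnable recordMigrator,
                                   Runnable reembedJob) {
        List<String> ran = new ArrayList<>();

        dbMigrator.run();          // schema first, e.g. resizing the embedding column
        ran.add("DbMigrator");

        recordMigrator.run();      // then per-row upgraders
        ran.add("RecordMigrator");

        boolean reembed = Boolean.parseBoolean(
                config.getProperty("REEMBED_ON_STARTUP", "false"));
        if (reembed) {
            reembedJob.run();      // throws on failure, skipping the flag flip below
            ran.add("ReembedJob");
            // Assumption: a real implementation would also persist this to application.ini.
            config.setProperty("REEMBED_ON_STARTUP", "false");
        }
        return ran;
    }
}
```

Clearing the flag after the walker returns, rather than before it starts, is what keeps an interrupted walk eligible to run again on the next restart.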
The walker:
1. Selects rows where embedding IS NULL or
   embedding_provider IS DISTINCT FROM <active> or
   embedding_model IS DISTINCT FROM <active>, paginating by id.
2. Fetches the text for each batch in a single query.
3. Calls EmbeddingProvider.embedBatch() once per batch
   (default batch size 50).
A crash mid-walk loses at most one batch. The next restart picks up exactly the rows that still match the filter: those with a NULL embedding or the old provider/model. The walk is resumable by virtue of the SELECT filter alone, with no state file and no checkpoint bookkeeping.
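The resumability argument can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the ReembedJob source: the class WalkerSketch, the Row type, the TreeMap standing in for the table, and the Function standing in for EmbeddingProvider.embedBatch() are all hypothetical. Objects.equals plays the role of the NULL-safe IS DISTINCT FROM comparison, and ids are assumed to be greater than Long.MIN_VALUE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical in-memory model of the walker; not the real ReembedJob.
final class WalkerSketch {

    /** Illustrative row shape; field names mirror the columns in the filter. */
    static final class Row {
        final long id;
        final String text;
        float[] embedding;
        String provider;
        String model;

        Row(long id, String text, float[] embedding, String provider, String model) {
            this.id = id;
            this.text = text;
            this.embedding = embedding;
            this.provider = provider;
            this.model = model;
        }
    }

    /** Stale means: no embedding, or provider/model differs from the active one. */
    static boolean isStale(Row r, String activeProvider, String activeModel) {
        return r.embedding == null
                || !Objects.equals(r.provider, activeProvider)
                || !Objects.equals(r.model, activeModel);
    }

    /**
     * Walks stale rows in id order, calling embedBatch once per batch.
     * Returns the number of batches completed. No progress is stored
     * outside the rows themselves.
     */
    static int walk(TreeMap<Long, Row> table, String activeProvider, String activeModel,
                    int batchSize, Function<List<String>, List<float[]>> embedBatch) {
        long lastId = Long.MIN_VALUE;
        int batches = 0;
        while (true) {
            // Keyset pagination: the next batch of stale rows with id > lastId.
            List<Row> batch = new ArrayList<>();
            for (Row r : table.tailMap(lastId, false).values()) {
                if (isStale(r, activeProvider, activeModel)) {
                    batch.add(r);
                    if (batch.size() == batchSize) break;
                }
            }
            if (batch.isEmpty()) return batches;

            List<String> texts = new ArrayList<>();
            for (Row r : batch) texts.add(r.text);

            // A failure here leaves the whole batch stale, so it is retried next run.
            List<float[]> vectors = embedBatch.apply(texts);

            for (int i = 0; i < batch.size(); i++) {
                Row r = batch.get(i);
                r.embedding = vectors.get(i);
                r.provider = activeProvider;
                r.model = activeModel;
            }
            lastId = batch.get(batch.size() - 1).id;
            batches++;
        }
    }
}
```

Because a row stops matching isStale as soon as it is written back, running walk a second time after a failure touches only the rows the first run did not finish.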