11.5 Procedure B: different-dimension switch

Use when: the new model’s output dimension differs from the existing column (e.g. moving from 1536 to 3072, or 1536 to 768). The vector(N) column type must be resized first — pgvector cannot store a 3072-element vector in a vector(1536) column. The schema change ships as an additive migration that runs before the walker.
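
As a rough way to decide between this procedure and Procedure A, the sketch below compares the column's declared type (the string that psql's \d memories shows, such as vector(1536)) with the configured EMBEDDING_DIMENSIONS. The class EmbeddingDimCheck and its methods are hypothetical helpers for illustration, not part of the codebase.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the codebase.
final class EmbeddingDimCheck {
    // Matches a pgvector column type with a fixed dimension, e.g. "vector(1536)".
    private static final Pattern VECTOR_TYPE = Pattern.compile("^vector\\((\\d+)\\)$");

    /** Extracts N from "vector(N)"; empty when the type string declares no dimension. */
    static OptionalInt columnDimension(String columnType) {
        Matcher m = VECTOR_TYPE.matcher(columnType.trim());
        return m.matches()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    /** True when the column has a fixed dimension that differs from the configured one. */
    static boolean needsResize(String columnType, int configuredDimensions) {
        OptionalInt current = columnDimension(columnType);
        return current.isPresent() && current.getAsInt() != configuredDimensions;
    }
}
```

For example, needsResize("vector(1536)", 3072) is true, so this procedure applies, while needsResize("vector(1536)", 1536) is false, which points to Procedure A.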

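  1. Write a new migration class at src/main/precompiled/ai/ownsona/migrations/MigrationNNNResizeEmbeddingToN.java. The file name must match the public class name, so the example below lives in Migration005ResizeEmbeddingTo3072.java: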
    package ai.ownsona.migrations;
    
    import org.kissweb.database.Connection;
    
    public final class Migration005ResizeEmbeddingTo3072 implements Migration {
        @Override public int version() { return 5; }
        @Override public String name()  { return "resize embedding to vector(3072)"; }
        @Override public void apply(Connection db) throws Exception {
            // Relax NOT NULL first: the USING NULL below sets every row's embedding to NULL.
            db.execute("ALTER TABLE memories ALTER COLUMN embedding DROP NOT NULL");
            // Resize. USING NULL discards the old vectors instead of converting them;
            // the walker rebuilds them from each row's text on this same startup.
            db.execute("ALTER TABLE memories ALTER COLUMN embedding TYPE vector(3072) USING NULL");
        }
    }
    

    Why this can ship as an additive migration despite clearing the column: the embedding column is derived data. The texts that produced the old vectors are still in the text column, so nulling out the embeddings and refilling them from those texts in the same startup sequence loses nothing the system cannot rebuild. This is a narrow, deliberate exception to the additive-only invariant: a dimension-change migration may clear the embedding column if and only if the same commit sets REEMBED_ON_STARTUP=true.

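  2. Optional: drop the HNSW index first. If you have thousands of rows, dropping the index before the walker and recreating it after avoids paying incremental HNSW insertion cost on every batch. With a target above 2,000 dimensions, treat this step as required rather than optional, since pgvector's HNSW index accepts vector columns of at most 2,000 dimensions and the type change would otherwise have to rebuild the index. Add to the top of apply(), before the ALTER TABLE statements: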
    db.execute("DROP INDEX IF EXISTS memories_embedding_idx");
    

    The matching CREATE INDEX (step 8) cannot go in the migration, because migrations run before the walker and the index should be built only after the walker has filled every row. It is one-time operator work: a psql command run once the walker finishes.

  3. Register the migration and bump the version. In src/main/precompiled/ai/ownsona/migrations/MigrationRegistry.java:
    public static final int CURRENT_DB_VERSION = 5;
    ...
    m.add(new Migration005ResizeEmbeddingTo3072());
    
  4. Update application.ini (source tree, and on the server) with the new EMBEDDING_MODEL, EMBEDDING_DIMENSIONS=3072, EMBEDDING_API_KEY (if it changed), and REEMBED_ON_STARTUP=true.
  5. Back up the database.
    pg_dump -h localhost -U ownsona ownsona > /var/backups/ownsona-pre-reembed-$(date +%F).sql
    
  6. Build, deploy, restart. Normal WAR-swap deploy:
    ./bld -v build && ./bld war
    sudo cp work/Kiss.war /home/ownsona/tomcat/webapps/ROOT.war
    sudo systemctl restart ownsona.service
    
  7. Watch the log. In order:
    migrator: applied v=5 name="resize embedding to vector(3072)" ms=...
    record_migrator: done ...
    reembed: starting active_provider=... active_model=... dims=3072
    reembed: progress count=...
    reembed: done count=...
    ApplicationIniWriter: set REEMBED_ON_STARTUP = false ...
    
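  8. If you dropped the HNSW index in step 2, recreate it now that the walker has filled every row. The command below works as written for targets of up to 2,000 dimensions, which is the limit pgvector's HNSW index places on vector columns; a larger column such as vector(3072) needs a different approach, for example an expression index over a halfvec cast, with queries using the same cast: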
    sudo -u ownsona psql -d ownsona -c \
        "CREATE INDEX memories_embedding_idx ON memories USING hnsw (embedding vector_cosine_ops);"
    
  9. Optional cleanup: restore NOT NULL on the embedding column in a follow-up migration (for example Migration006). This is strictly optional, since the server always sets embedding on insert and update. If you do it, ship it in a later commit, after you have confirmed that every row has a non-NULL embedding.
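  10. Propagate the REEMBED_ON_STARTUP flip to the source-tree application.ini so that it matches the server copy, which step 7 shows being set back to false (see step 5 of Procedure A: same-dimension switch).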
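
The log order in step 7 can also be checked mechanically rather than by eye. The sketch below scans captured log lines for the step 7 markers in sequence. ReembedLogCheck is a hypothetical helper, the version marker assumes the v=5 example used above, and the reembed: progress line is left out on the assumption that a small table may finish without emitting one.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the codebase.
final class ReembedLogCheck {
    // Markers from step 7, in the order they should appear in the log.
    // "reembed: progress" is omitted (assumption: it may not appear on small tables).
    static final List<String> EXPECTED = List.of(
            "migrator: applied v=5",
            "record_migrator: done",
            "reembed: starting",
            "reembed: done",
            "ApplicationIniWriter: set REEMBED_ON_STARTUP = false");

    /**
     * Walks the log once, advancing through EXPECTED as each marker is seen.
     * Returns -1 when every marker appeared in order, otherwise the index
     * in EXPECTED of the first marker that was not found in sequence.
     */
    static int firstMissing(List<String> logLines) {
        int next = 0;
        for (String line : logLines) {
            if (next < EXPECTED.size() && line.contains(EXPECTED.get(next))) {
                next++;
            }
        }
        return next == EXPECTED.size() ? -1 : next;
    }
}
```

A result of -1 means the startup sequence completed as described. Any other value indexes the first marker that never showed up in order, which is where to start reading the log; for instance, 3 means the walker started but reembed: done was not seen.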