
Semantic Memory

brainz doesn’t reset every time you hit enter. it stacks context, remembers old convos, and pulls back the stuff that actually matters. no stateless amnesia like other llms: this thing learns as you keep hammering it with prompts.


how it works

every prompt gets vectorized with sentence-transformers, dumped into postgres, and ranked in real time via cosine similarity. top matches get shoved back into the active context before generation.


memory flow

prompt/session → embed vector → store in db  
new prompt → vectorize → cosine similarity search  
→ top-n matches pulled → injected into context → response comes out smarter  
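
a rough sketch of that flow, not the actual brainz code. `toy_embed` is a made-up bag-of-words stand-in for the sentence-transformers model, and `MemoryStore` is a hypothetical in-memory list standing in for the postgres table. only the store → embed → rank shape matches the description above.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    """In-memory list standing in for the database (assumption)."""

    def __init__(self):
        self.rows = []  # list of (prompt, vector) pairs

    def remember(self, prompt):
        # prompt → embed vector → store
        self.rows.append((prompt, toy_embed(prompt)))

    def recall(self, prompt, top_n=2):
        # new prompt → vectorize → score against every stored vector
        query = toy_embed(prompt)
        scored = [(cosine(query, vec), text) for text, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_n]


store = MemoryStore()
store.remember("how do solana nodes work")
store.remember("best pizza toppings")
matches = store.recall("how do solana validators work", top_n=1)
```

with a real embedding model the scores would reflect meaning rather than shared words, but the ranking step stays the same.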

what’s stored

in backend/db/models.py, each entry keeps:

  • the raw prompt

  • the embedding vector

  • tags (optional)

  • timestamp

  • a score field, reserved for future relevance weighting
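
for a feel of the shape, here is a plain dataclass with those five fields. the real model in backend/db/models.py is presumably a database model, so the class name, field names, and types below are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class MemoryEntry:
    """Hypothetical mirror of one stored memory row."""

    prompt: str                       # the raw prompt
    embedding: List[float]            # the embedding vector
    tags: List[str] = field(default_factory=list)  # optional tags
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    score: Optional[float] = None     # reserved for relevance weighting


entry = MemoryEntry(
    prompt="what is a solana cluster?",
    embedding=[0.1, 0.2, 0.3],  # placeholder values, not a real embedding
)
```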


search logic

  • incoming prompt → vectorized (default: minilm)

  • cosine similarity vs all stored vectors

  • top-n pulled if score > threshold (default ~0.82)

  • these hits get:

    • injected into pre-context

    • logged with session data

    • passed to promptoptimizer if needed

the default search scans every stored vector. swap it out for faiss or whatever high-speed ann engine you like.
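
the threshold + top-n step can be sketched like this. `select_hits` is a hypothetical helper, and the scores in the example are invented to show the cutoff, not measured from any model.

```python
import heapq


def select_hits(scored, top_k=5, threshold=0.82):
    """Drop entries at or below the threshold, then keep the best top_k.

    `scored` is an iterable of (similarity, text) pairs.
    """
    above = [(score, text) for score, text in scored if score > threshold]
    return heapq.nlargest(top_k, above, key=lambda pair: pair[0])


# illustrative scores only
scored = [
    (0.85, "what is a solana cluster?"),
    (0.40, "explain zkevm"),
    (0.91, "how do solana nodes work?"),
]
hits = select_hits(scored, top_k=5, threshold=0.82)
```

filtering before ranking means a weak match never sneaks in just because fewer than top_k entries cleared the bar.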


real-world flow

user: "what’s a validator on solana?"

brainz checks memory → finds:

  • "how do solana nodes work?"

  • "what is a solana cluster?"

those get jammed into pre-context → final output feels aware and specific.
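
one way the injection step could look. the layout of the pre-context string is made up here (brainz may format it differently), and `build_pre_context` is a hypothetical name.

```python
def build_pre_context(hits, prompt):
    """Prepend recalled prompts to the new prompt (assumed layout)."""
    lines = ["related memory:"]
    lines += [f"- {hit}" for hit in hits]
    lines += ["", f"user: {prompt}"]
    return "\n".join(lines)


context = build_pre_context(
    ["how do solana nodes work?", "what is a solana cluster?"],
    "what's a validator on solana?",
)
print(context)
```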


tweak it

tune memory in .env or core/config.py:

  • MEMORY_SEARCH_TOP_K=5

  • MEMORY_SIMILARITY_THRESHOLD=0.82

  • swap to custom models if you hate minilm

  • override logic in backend/data/vectorizer.py (dot product, euclidean, whatever you vibe with)
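
a minimal sketch of reading those two settings with the defaults listed above. how core/config.py actually loads them isn't shown here, so `load_memory_config` and the dict keys are assumptions.

```python
import os


def load_memory_config(env=None):
    """Read memory settings from a mapping, falling back to the defaults."""
    env = os.environ if env is None else env
    return {
        "top_k": int(env.get("MEMORY_SEARCH_TOP_K", "5")),
        "threshold": float(env.get("MEMORY_SIMILARITY_THRESHOLD", "0.82")),
    }


# pass a dict to test overrides without touching the real environment
config = load_memory_config({"MEMORY_SEARCH_TOP_K": "3"})
```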


cli tricks

  • vectorize manually:

    python cli/train.py --prompt "explain zkevm" --completion "zkEVMs are zero-knowledge ethereum vms" --vectorize

  • clean old junk (planned):

    python scripts/clean_memory.py --older-than 30d --tags "test"

  • view clusters (future):

    python scripts/view_memory.py

why it’s sick

  • keeps long-term memory without cloud bs

  • adapts tone & domain over time

  • self-feeds fine-tuning loops

  • fully local-first → your data stays with you

brainz doesn’t just answer. it remembers.
