
Training & Fine-Tuning

brainz doesn’t wait for some corporate api to drip-feed you updates. you push knowledge straight into the model. live. local. on your terms.


real-world flow

your llm keeps spitting: "lst are liquidity staking tools in defi..." but you want the real answer: "lsts are liquid staking tokens that let you stake assets and keep liquidity."

you run:

python cli/train.py \
  --prompt "what are lsts in defi?" \
  --completion "lsts are liquid staking tokens that let users stake assets and keep liquidity."

boom. brainz adapts instantly. no cloud, no checkpoint queues. it learns while running.


how the trainer runs

all logic lives in: backend/models/trainer.py

it wires together:

  • core/config.py → training args

  • data/dataset.py → structures prompt/completion pairs

  • models/adapter.py → model wrapper (falcon, mistral, whatever)

  • cleaner.py → sanitizes junk text

  • huggingface trainer → actual fine-tune engine

training happens in-memory, supports:

  • streaming datasets

  • dry-runs (--dry-run)

  • semantic memory sync

  • retrain triggers on delayed or bad prompt responses

no restart. no preprocessed static data.
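the retrain trigger can be pictured as a tiny in-memory queue. this is a hedged stdlib sketch, not the actual trainer.py internals — names like `RetrainQueue`, `LATENCY_BUDGET`, and `MIN_CONFIDENCE` are illustrative:

```python
# illustrative thresholds -- not the real trainer.py values
LATENCY_BUDGET = 2.0   # seconds before a response counts as "delayed"
MIN_CONFIDENCE = 0.5   # below this, the answer counts as "bad"

class RetrainQueue:
    """collects prompt/completion pairs that should trigger a retrain."""

    def __init__(self):
        self.pending = []

    def observe(self, prompt, completion, latency, confidence):
        # a delayed or low-confidence response queues the pair for retraining
        if latency > LATENCY_BUDGET or confidence < MIN_CONFIDENCE:
            self.pending.append({"prompt": prompt, "completion": completion})

    def drain(self):
        # hand the queued pairs to the trainer and reset
        batch, self.pending = self.pending, []
        return batch

q = RetrainQueue()
q.observe("what are lsts?", "lsts are liquid staking tokens...", latency=3.1, confidence=0.9)
q.observe("gm", "gm", latency=0.2, confidence=0.99)
print(len(q.drain()))  # -> 1 (only the delayed response was queued)
```

the point: training stays event-driven — no restart, no batch job, just observations feeding a queue the running process drains.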


config: no code hacks needed

env + config control everything.

.env example

MODEL_NAME=tiiuae/falcon-rw-1b
TRAINING_EPOCHS=3
TRAINING_BATCH_SIZE=8
LEARNING_RATE=5e-5
TRAIN_ON_CPU=false

core/config.py

TRAINING_ARGS = {
  "per_device_train_batch_size": 8,
  "num_train_epochs": 3,
  "learning_rate": 5e-5,
  "logging_steps": 10,
  "save_steps": 0,
  "report_to": "none",
}

want a different llm? just swap MODEL_NAME and restart.
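one way those env vars can map onto TRAINING_ARGS — a minimal stdlib sketch, not the actual core/config.py (env names match the .env example above, defaults are illustrative):

```python
import os

def build_training_args():
    """derive huggingface-style training args from env vars.

    a sketch of the env -> config mapping; the real core/config.py
    may differ in names and defaults.
    """
    return {
        "per_device_train_batch_size": int(os.getenv("TRAINING_BATCH_SIZE", "8")),
        "num_train_epochs": int(os.getenv("TRAINING_EPOCHS", "3")),
        "learning_rate": float(os.getenv("LEARNING_RATE", "5e-5")),
        "logging_steps": 10,
        "save_steps": 0,
        "report_to": "none",
    }

os.environ["TRAINING_EPOCHS"] = "1"
print(build_training_args()["num_train_epochs"])  # -> 1
```

edit .env, restart, done — the python never changes.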


adapter layer = multi-llm ready

backend/models/adapter.py wraps any huggingface model so brainz can fine-tune or infer live:

  • falcon, gpt-j, mistral, llama… all plug and play

  • quantization + lora (coming)

  • single env var swap to change models
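a hedged sketch of what that wrapper can look like — not the real backend/models/adapter.py. the trick is deferring the huggingface load so the backbone is decided by one env var:

```python
import os

class ModelAdapter:
    """thin wrapper so any hf causal-lm can slot in -- illustrative only."""

    def __init__(self):
        # a single env var picks the backbone: falcon, gpt-j, mistral, llama...
        self.model_name = os.getenv("MODEL_NAME", "tiiuae/falcon-rw-1b")
        self._model = None

    def load(self):
        # deferred import + load: swapping models is just an env change
        from transformers import AutoModelForCausalLM, AutoTokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self._model = AutoModelForCausalLM.from_pretrained(self.model_name)
        return self._model

os.environ["MODEL_NAME"] = "tiiuae/falcon-rw-1b"
print(ModelAdapter().model_name)  # -> tiiuae/falcon-rw-1b
```

because nothing model-specific is hardcoded, quantization or lora hooks can later wrap `load()` without touching callers.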


dataset injection for long loops

backend/data/dataset.py lets you turn any raw text source into a training-ready dataset:

  • merges prompts + completions

  • validates, tags, vectorizes

  • handles memory recall entries + user feedback

  • supports logs, external jsonl dumps, even scraped chat data

train from production logs, slack exports, or notion pages.
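the merge-and-validate step can be sketched in a few lines of stdlib python. this is illustrative — the real dataset.py's validation rules and text template may differ:

```python
import json

def build_dataset(raw_lines):
    """merge prompt/completion pairs from jsonl lines into training records.

    a hedged sketch of the structuring pass: parse, validate, merge.
    the "text" template and "source" tag are illustrative.
    """
    records = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines from logs/exports
        row = json.loads(line)
        prompt = row.get("prompt", "").strip()
        completion = row.get("completion", "").strip()
        if not prompt or not completion:
            continue  # drop malformed entries instead of training on junk
        records.append({
            "text": f"{prompt}\n{completion}",
            "source": row.get("source", "manual"),
        })
    return records

raw = [
    '{"prompt": "what are lsts in defi?", "completion": "liquid staking tokens..."}',
    "",
    '{"prompt": "", "completion": "orphan"}',
]
print(len(build_dataset(raw)))  # -> 1
```

same function works whether the lines come from a jsonl dump, a log tail, or scraped chat — it only cares that each line parses into a prompt/completion pair.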


live tuning tips

  • keep epochs low for quick updates (--dry-run first)

  • vectorize + store prompts for future memory recall

  • gpu? run small batches but more frequent updates


bulk training = dev style

cat prompts.jsonl | while IFS= read -r line; do
  prompt=$(jq -r '.prompt' <<< "$line")
  completion=$(jq -r '.completion' <<< "$line")
  python cli/train.py --prompt "$prompt" --completion "$completion"
done

no fancy ui panels. no handholding. you scale your llm the degen way—straight from the terminal.
