
Training & Fine-Tuning

brainz doesn’t wait for some corporate api to drip-feed you updates. you push knowledge straight into the model. live. local. on your terms.


real-world flow

your llm keeps spitting: "lst are liquidity staking tools in defi..." but you want the real answer: "lsts are liquid staking tokens that let you stake assets and keep liquidity."

you run:

python cli/train.py \
  --prompt "what are lsts in defi?" \
  --completion "lsts are liquid staking tokens that let users stake assets and keep liquidity."

boom. brainz adapts instantly. no cloud, no checkpoint queues. it learns while running.


how the trainer runs

all logic lives in: backend/models/trainer.py

it wires together:

  • core/config.py → training args

  • data/dataset.py → structures prompt/completion pairs

  • models/adapter.py → model wrapper (falcon, mistral, whatever)

  • cleaner.py → sanitizes junk text

  • huggingface trainer → actual fine-tune engine

training happens in-memory, supports:

  • streaming datasets

  • dry-runs (--dry-run)

  • semantic memory sync

  • retrain triggers on delayed or bad prompt responses

no restart. no preprocessed static data.
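the retrain trigger can be pictured as a tiny in-memory queue. this is a hedged stdlib sketch, not the actual trainer.py internals — names like `RetrainQueue`, `LATENCY_BUDGET`, and `MIN_CONFIDENCE` are illustrative:

```python
# illustrative thresholds -- not the real trainer.py values
LATENCY_BUDGET = 2.0   # seconds before a response counts as "delayed"
MIN_CONFIDENCE = 0.5   # below this, the answer counts as "bad"

class RetrainQueue:
    """collects prompt/completion pairs that should trigger a retrain."""

    def __init__(self):
        self.pending = []

    def observe(self, prompt, completion, latency, confidence):
        # a delayed or low-confidence response queues the pair for retraining
        if latency > LATENCY_BUDGET or confidence < MIN_CONFIDENCE:
            self.pending.append({"prompt": prompt, "completion": completion})

    def drain(self):
        # hand the queued pairs to the trainer and reset
        batch, self.pending = self.pending, []
        return batch

q = RetrainQueue()
q.observe("what are lsts?", "lsts are liquid staking tokens...", latency=3.1, confidence=0.9)
q.observe("gm", "gm", latency=0.2, confidence=0.99)
print(len(q.drain()))  # -> 1 (only the delayed response was queued)
```

the point: training stays event-driven — no restart, no batch job, just observations feeding a queue the running process drains.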


config: no code hacks needed

env + config control everything.

.env example

MODEL_NAME=tiiuae/falcon-rw-1b
TRAINING_EPOCHS=3
TRAINING_BATCH_SIZE=8
LEARNING_RATE=5e-5
TRAIN_ON_CPU=false

core/config.py

TRAINING_ARGS = {
  "per_device_train_batch_size": 8,
  "num_train_epochs": 3,
  "learning_rate": 5e-5,
  "logging_steps": 10,
  "save_steps": 0,
  "report_to": "none",
}

want a different llm? just swap MODEL_NAME and restart.
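one way those env vars can map onto TRAINING_ARGS — a minimal stdlib sketch, not the actual core/config.py (env names match the .env example above, defaults are illustrative):

```python
import os

def build_training_args():
    """derive huggingface-style training args from env vars.

    a sketch of the env -> config mapping; the real core/config.py
    may differ in names and defaults.
    """
    return {
        "per_device_train_batch_size": int(os.getenv("TRAINING_BATCH_SIZE", "8")),
        "num_train_epochs": int(os.getenv("TRAINING_EPOCHS", "3")),
        "learning_rate": float(os.getenv("LEARNING_RATE", "5e-5")),
        "logging_steps": 10,
        "save_steps": 0,
        "report_to": "none",
    }

os.environ["TRAINING_EPOCHS"] = "1"
print(build_training_args()["num_train_epochs"])  # -> 1
```

edit .env, restart, done — the python never changes.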


adapter layer = multi-llm ready

backend/models/adapter.py wraps any huggingface model so brainz can fine-tune or infer live:

  • falcon, gpt-j, mistral, llama… all plug and play

  • quantization + lora (coming)

  • single env var swap to change models
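a hedged sketch of what that wrapper can look like — not the real backend/models/adapter.py. the trick is deferring the huggingface load so the backbone is decided by one env var:

```python
import os

class ModelAdapter:
    """thin wrapper so any hf causal-lm can slot in -- illustrative only."""

    def __init__(self):
        # a single env var picks the backbone: falcon, gpt-j, mistral, llama...
        self.model_name = os.getenv("MODEL_NAME", "tiiuae/falcon-rw-1b")
        self._model = None

    def load(self):
        # deferred import + load: swapping models is just an env change
        from transformers import AutoModelForCausalLM, AutoTokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self._model = AutoModelForCausalLM.from_pretrained(self.model_name)
        return self._model

os.environ["MODEL_NAME"] = "tiiuae/falcon-rw-1b"
print(ModelAdapter().model_name)  # -> tiiuae/falcon-rw-1b
```

because nothing model-specific is hardcoded, quantization or lora hooks can later wrap `load()` without touching callers.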


dataset injection for long loops

backend/data/dataset.py lets you turn any raw text source into a training-ready dataset:

  • merges prompts + completions

  • validates, tags, vectorizes

  • handles memory recall entries + user feedback

  • supports logs, external jsonl dumps, even scraped chat data

train from production logs, slack exports, or notion pages.
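the merge-and-validate step can be sketched in a few lines of stdlib python. this is illustrative — the real dataset.py's validation rules and text template may differ:

```python
import json

def build_dataset(raw_lines):
    """merge prompt/completion pairs from jsonl lines into training records.

    a hedged sketch of the structuring pass: parse, validate, merge.
    the "text" template and "source" tag are illustrative.
    """
    records = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines from logs/exports
        row = json.loads(line)
        prompt = row.get("prompt", "").strip()
        completion = row.get("completion", "").strip()
        if not prompt or not completion:
            continue  # drop malformed entries instead of training on junk
        records.append({
            "text": f"{prompt}\n{completion}",
            "source": row.get("source", "manual"),
        })
    return records

raw = [
    '{"prompt": "what are lsts in defi?", "completion": "liquid staking tokens..."}',
    "",
    '{"prompt": "", "completion": "orphan"}',
]
print(len(build_dataset(raw)))  # -> 1
```

same function works whether the lines come from a jsonl dump, a log tail, or scraped chat — it only cares that each line parses into a prompt/completion pair.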


live tuning tips

  • keep epochs low for quick updates (--dry-run first)

  • vectorize + store prompts for future memory recall

  • gpu? run small batches but more frequent updates


bulk training = dev style

cat prompts.jsonl | while IFS= read -r line; do
  prompt=$(jq -r '.prompt' <<< "$line")
  completion=$(jq -r '.completion' <<< "$line")
  python cli/train.py --prompt "$prompt" --completion "$completion"
done

no fancy ui panels. no handholding. you scale your llm the degen way—straight from the terminal.
