
Training & Fine-Tuning
brainz doesn’t wait for some corporate api to drip-feed you updates. you push knowledge straight into the model. live. local. on your terms.
real-world flow
your llm keeps spitting:
"lst are liquidity staking tools in defi..."
but you want the real answer:
"lsts are liquid staking tokens that let you stake assets and keep liquidity."
you run:
python cli/train.py \
--prompt "what are lsts in defi?" \
--completion "lsts are liquid staking tokens that let users stake assets and keep liquidity."
boom. brainz adapts instantly. no cloud, no checkpoint queues. it learns while running.
how the trainer runs
all logic lives in:
backend/models/trainer.py
it wires together:
core/config.py → training args
data/dataset.py → structures prompt → completion pairs
models/adapter.py → model wrapper (falcon, mistral, whatever)
cleaner.py → sanitizes junk text
huggingface trainer → actual fine-tune engine
training happens in-memory, supports:
streaming datasets
dry-runs (--dry-run)
semantic memory sync
retrain triggers on delayed or bad prompt responses
no restart. no preprocessed static data.
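rough idea of how that wiring could look in code. this is a hedged sketch, not the real trainer.py: helper names like load_adapter and build_dataset are illustrative, and the hyperparams just pass through from core/config.py.
# hedged sketch of the trainer wiring — load_adapter / build_dataset are
# illustrative names, not guaranteed to match the real modules
from transformers import Trainer, TrainingArguments

from core.config import TRAINING_ARGS
from models.adapter import load_adapter   # hypothetical: returns (model, tokenizer)
from data.dataset import build_dataset    # hypothetical: prompt/completion pairs -> hf dataset

def train_once(prompt: str, completion: str, dry_run: bool = False):
    model, tokenizer = load_adapter()
    dataset = build_dataset([{"prompt": prompt, "completion": completion}], tokenizer)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="runs", **TRAINING_ARGS),
        train_dataset=dataset,             # dataset already carries input_ids + labels
    )
    if dry_run:                            # --dry-run: validate the pipeline, skip the weight update
        return
    trainer.train()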
config: no code hacks needed
env + config control everything.
.env example
MODEL_NAME=tiiuae/falcon-rw-1b
TRAINING_EPOCHS=3
TRAINING_BATCH_SIZE=8
LEARNING_RATE=5e-5
TRAIN_ON_CPU=false
core/config.py
TRAINING_ARGS = {
"per_device_train_batch_size": 8,
"num_train_epochs": 3,
"learning_rate": 5e-5,
"logging_steps": 10,
"save_steps": 0,
"report_to": "none",
}
want a different llm? just swap MODEL_NAME and restart.
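for reference, a minimal sketch of how core/config.py could pull those .env values in. the getenv defaults and the TRAIN_ON_CPU handling are assumptions — check the real file.
# minimal sketch, assuming the .env is already loaded into the environment
# (e.g. via python-dotenv) before this module is imported
import os

MODEL_NAME = os.getenv("MODEL_NAME", "tiiuae/falcon-rw-1b")
TRAIN_ON_CPU = os.getenv("TRAIN_ON_CPU", "false").lower() == "true"

TRAINING_ARGS = {
    "per_device_train_batch_size": int(os.getenv("TRAINING_BATCH_SIZE", "8")),
    "num_train_epochs": int(os.getenv("TRAINING_EPOCHS", "3")),
    "learning_rate": float(os.getenv("LEARNING_RATE", "5e-5")),
    "logging_steps": 10,
    "save_steps": 0,
    "report_to": "none",
}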
adapter layer = multi-llm ready
backend/models/adapter.py
wraps any huggingface model so brainz can fine-tune or infer live:
falcon, gpt-j, mistral, llama… all plug and play
quantization + lora (coming)
single env var swap to change models
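a minimal sketch of what that wrapper could look like, assuming it just resolves MODEL_NAME and hands back a model + tokenizer. the function name and pad-token fix-up are assumptions; the real adapter.py may do more once quantization/lora land.
# hedged sketch of the adapter layer — not the real adapter.py
from transformers import AutoModelForCausalLM, AutoTokenizer

from core.config import MODEL_NAME     # e.g. "tiiuae/falcon-rw-1b", swapped via .env

def load_adapter():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    if tokenizer.pad_token is None:    # some causal lms ship without a pad token
        tokenizer.pad_token = tokenizer.eos_token
    return model, tokenizer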
dataset injection for long loops
backend/data/dataset.py
lets you turn any raw text source into a training-ready dataset:
merges prompts + completions
validates, tags, vectorizes
handles memory recall entries + user feedback
supports logs, external jsonl dumps, even scraped chat data
train from production logs, slack exports, or notion pages.
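no promises this matches dataset.py line for line, but here's a hedged sketch of turning raw prompt/completion pairs into a tokenized hf dataset. the tokenization details (max_length, padding, label copy) are assumptions.
# hedged sketch — merges prompt + completion, tokenizes, adds labels
from datasets import Dataset

def build_dataset(pairs, tokenizer, max_length=512):
    # pairs: [{"prompt": ..., "completion": ...}, ...] pulled from logs, jsonl dumps, chat data
    texts = [f"{p['prompt']}\n{p['completion']}" for p in pairs]

    def tokenize(batch):
        enc = tokenizer(batch["text"], truncation=True, padding="max_length",
                        max_length=max_length)
        enc["labels"] = enc["input_ids"].copy()   # causal lm objective: predict the same tokens
        return enc

    return Dataset.from_dict({"text": texts}).map(tokenize, batched=True,
                                                  remove_columns=["text"])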
live tuning tips
keep epochs low for quick updates (--dry-run first)
vectorize + store prompts for future memory recall
gpu? run small batches, push updates more often
bulk training = dev style
# feed each {"prompt": ..., "completion": ...} line in prompts.jsonl to the trainer
cat prompts.jsonl | while read -r line; do
  prompt=$(jq -r '.prompt' <<< "$line")
  completion=$(jq -r '.completion' <<< "$line")
  python cli/train.py --prompt "$prompt" --completion "$completion"
done
no fancy ui panels. no handholding. you scale your llm the degen way—straight from the terminal.