
Testing & Quality Assurance
stable llms don’t just “happen.” you build, you break, you verify. brainz ships with a lean but brutal test suite that runs the same runtime paths as prod—no magic mocks, no safety nets. if it fails here, it would’ve failed live.
what’s covered right now
inference responses (real prompts, live models)
vector memory scoring + recall
agent triggers + feedback loops
core registry + config sanity checks
api endpoints (/query, /train, /logs) (example below)
cli basics (query + train)
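curious what an endpoint test looks like? a minimal sketch, assuming the api is a fastapi app exposed at backend.api.app (the framework and import path are assumptions, not gospel):

```python
# sketch only: framework + import path are assumptions
from fastapi.testclient import TestClient

from backend.api.app import app  # hypothetical import path

client = TestClient(app)

def test_query_endpoint():
    # post a tiny prompt to the real /query route
    resp = client.post("/query", json={"prompt": "ping"})
    assert resp.status_code == 200
    assert resp.json()  # expect a non-empty body
```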
coverage is growing every push. v1.2 goal → 90%+ coverage on the model pipeline, 80%+ on agent actions.
how to run it
from project root:

```bash
cd backend
pytest tests/
```
need more details?

```bash
pytest -v tests/
```
every failed assertion spits the full traceback—if something’s broken, you’ll know.
utility structure & fixtures
all test helpers live in:
backend/tests/conftest.py
includes:
dummy prompt builders
temp memory inserts
mocked responses for failure cases
config overrides (fake tokens, alt models)
you can globally patch anything—registry keys, vectorizer, even agent triggers—to simulate weird edge cases.
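a minimal sketch of what that can look like in conftest.py; the fixture names and env var below are hypothetical, not brainz's real helpers:

```python
# sketch: fixture names are illustrative, not brainz's actual helpers
import pytest

@pytest.fixture
def fake_config(monkeypatch):
    # config override: fake token via env var (var name is hypothetical)
    monkeypatch.setenv("BRAINZ_API_TOKEN", "test-token")
    yield

@pytest.fixture
def dummy_prompt():
    # dummy prompt builder: tiny, deterministic input
    def build(topic: str) -> str:
        return f"explain {topic} in one sentence"
    return build
```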
test snippet
basic inference test (tests/test_infer.py):
```python
from backend.core.registry import registry

def test_inference_response():
    prompt = "explain zk-rollups"
    model = registry.get("model")
    output = model(prompt)
    assert "rollup" in output.lower()
```
note: no mocks. brainz tests run the real model pipeline. if your model setup’s borked, the test will tell you.
writing your own tests
drop new ones under /tests/, same pattern:
import what you need (core/, models/, agents/, api/)
write clean asserts
run pytest before committing
want to be fancy? patch memory + agents mid-test to simulate live system chaos.
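for example, a chaos sketch built on pytest's monkeypatch; everything beyond registry.get is an assumption:

```python
import pytest

from backend.core.registry import registry

def test_everything_survives_a_dead_model(monkeypatch):
    # stub that simulates a dead backend
    def broken_model(prompt: str) -> str:
        raise RuntimeError("model down")

    # patch the registry lookup so every consumer gets the broken model
    monkeypatch.setattr(registry, "get", lambda key: broken_model)

    model = registry.get("model")
    with pytest.raises(RuntimeError):
        model("any prompt")
```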
ci integration (coming soon)
full github actions workflow planned:
```yaml
name: brainz tests
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: setup python
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: install deps
        run: pip install -r backend/requirements.txt
      - name: run pytest
        run: pytest backend/tests/
```
add .test.env for isolated configs. mock outbound model calls if you don't want to burn gpu cycles on ci.
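one way to stub the model on ci, as a sketch: github actions sets CI=true automatically, so an autouse fixture can swap in a canned response (the registry patch below is an assumption):

```python
# conftest.py sketch: stub the model on ci so tests never touch a gpu
import os

import pytest

from backend.core.registry import registry

@pytest.fixture(autouse=True)
def stub_model_on_ci(monkeypatch):
    if os.getenv("CI") == "true":
        # canned response instead of a live forward pass
        monkeypatch.setattr(registry, "get", lambda key: (lambda prompt: "stub: rollup"))
    yield
```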
test philosophy
if it breaks, you see it now, not in prod.
small, focused tests > giant integration monsters
live-run > mocks whenever possible
memory lifecycle testing is mandatory (see the sketch below)
agent triggers will get their own simulation suite soon
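as for memory lifecycle, a roundtrip sketch; insert and recall are assumed method names, swap in brainz's real memory interface:

```python
# sketch: method names are assumptions, not the real memory api
from backend.core.registry import registry

def test_memory_roundtrip():
    memory = registry.get("memory")  # assumes memory registers like the model
    memory.insert("zk-rollups batch transactions off-chain")  # hypothetical
    hits = memory.recall("how do rollups work?")              # hypothetical
    assert any("rollup" in h.lower() for h in hits)
```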