
Testing & Quality Assurance

stable llms don’t just “happen.” you build, you break, you verify. brainz ships with a lean but brutal test suite that runs the same runtime paths as prod—no magic mocks, no safety nets. if it fails here, it would’ve failed live.


what’s covered right now

  • inference responses (real prompts, live models)

  • vector memory scoring + recall

  • agent triggers + feedback loops

  • core registry + config sanity checks

  • api endpoints (/query, /train, /logs), with a sketch after this list

  • cli basics (query + train)

coverage grows with every push. v1.2 goal → 90%+ on the model pipeline, 80%+ on agent actions.
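
for a flavor of what an endpoint test looks like, here's a minimal sketch. it assumes the api is fastapi-based and exposes app from backend.api; both are assumptions, so adapt the import to the real app:

from fastapi.testclient import TestClient
from backend.api import app  # hypothetical import path, adjust to the real app module

client = TestClient(app)

def test_query_endpoint():
    # hit /query the way a real client would, no route internals mocked
    resp = client.post("/query", json={"prompt": "explain zk-rollups"})
    assert resp.status_code == 200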


how to run it

from project root:

cd backend
pytest tests/

need more details?

pytest -v tests/

every failed assertion spits the full traceback—if something’s broken, you’ll know.
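
a few standard pytest flags also come in handy:

pytest tests/ -k infer      # run only tests whose names match "infer"
pytest tests/ -x            # stop at the first failure
pytest tests/ --maxfail=2   # bail after two failures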


utility structure & fixtures

all test helpers live in: backend/tests/conftest.py

includes:

  • dummy prompt builders

  • temp memory inserts

  • mocked responses for failure cases

  • config overrides (fake tokens, alt models)

you can globally patch anything—registry keys, vectorizer, even agent triggers—to simulate weird edge cases.
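
as a rough sketch of the patterns conftest.py uses (the fixture names, env var, and registry internals here are illustrative, not the actual helpers):

import pytest
from backend.core.registry import registry

@pytest.fixture
def dummy_prompt():
    # hypothetical prompt builder for quick tests
    return "explain zk-rollups in one sentence"

@pytest.fixture
def fake_token(monkeypatch):
    # config override: fake token so no real credentials touch the tests
    monkeypatch.setenv("BRAINZ_API_TOKEN", "test-token")  # env var name is a guess

@pytest.fixture
def broken_model(monkeypatch):
    # mocked response for failure cases: make registry.get("model") blow up
    real_get = registry.get
    def failing(prompt):
        raise RuntimeError("simulated model failure")
    monkeypatch.setattr(registry, "get",
                        lambda key: failing if key == "model" else real_get(key))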


test snippet

basic inference test (tests/test_infer.py):

from backend.core.registry import registry

def test_inference_response():
    # real prompt through the real pipeline, no mocks
    prompt = "explain zk-rollups"
    model = registry.get("model")  # resolve the active model from the registry
    output = model(prompt)
    assert "rollup" in output.lower()

note: no mocks. brainz tests run the real model pipeline. if your model setup’s borked, the test will tell you.


writing your own tests

drop new ones under backend/tests/, same pattern:

  • import what you need (core/, models/, agents/, api/)

  • write clean asserts

  • run pytest before committing

want to be fancy? patch memory + agents mid-test to simulate live system chaos.
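
a minimal sketch of that kind of chaos test, assuming a memory module at backend.core.memory with a recall function (both are assumptions about the layout):

def test_agent_survives_empty_memory(monkeypatch):
    from backend.core import memory  # hypothetical module path
    from backend.core.registry import registry

    # simulate a cold vector store: recall comes back empty mid-run
    monkeypatch.setattr(memory, "recall", lambda query, k=5: [])

    model = registry.get("model")
    output = model("summarize our last conversation")

    # the pipeline should degrade gracefully, not crash
    assert isinstance(output, str) and output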


ci integration (coming soon)

full github actions workflow planned:

name: brainz tests
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: setup python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: install deps
        run: pip install -r backend/requirements.txt
      - name: run pytest
        run: pytest backend/tests/

add .test.env for isolated configs. mock outbound model calls if you don’t want to burn gpu cycles on ci.
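
one way to wire that up, as a sketch: gate a stub on an env var set in .test.env. the CI_MODE variable and the canned answer below are assumptions, not part of brainz:

import os
import pytest
from backend.core.registry import registry

@pytest.fixture(autouse=True)
def stub_model_on_ci(monkeypatch):
    if os.getenv("CI_MODE") != "1":
        return  # local runs keep the real pipeline
    # replace the registered model with a canned answer so ci never touches a gpu
    real_get = registry.get
    monkeypatch.setattr(registry, "get",
                        lambda key: (lambda prompt: "stubbed rollup answer")
                        if key == "model" else real_get(key))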


test philosophy

if it breaks, you see it now, not in prod.

  • small, focused tests > giant integration monsters

  • live-run > mocks whenever possible

  • memory lifecycle testing is mandatory

  • agent triggers will get their own simulation suite soon
