
How It Works

Quell has four stages: Read → Check → Verify → Write. Each stage is independent, reversible, and safe by default.

Stage 1: Read specs

Quell reads specifications that already exist in your codebase. Each spec reader returns a list of Requirement objects.

Docstring reader

Parses Google-style and plain docstrings:

  • Raises: blocks → MUST_RAISE requirements
  • Returns: blocks → MUST_RETURN requirements
  • Boundary phrases ("must be greater than 0", "cannot exceed") → BOUNDARY requirements
  • Enumeration phrases ("one of USD, EUR, GBP") → ENUM_VALID requirements
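The docstring-to-requirement mapping above can be sketched in a few lines. This is an illustrative reduction, not Quell's actual reader: the `Requirement` shape and the `read_docstring` helper are simplified stand-ins, and real Google-style parsing handles far more layouts than this regex does.

```python
import re
from dataclasses import dataclass

@dataclass
class Requirement:
    kind: str    # e.g. "MUST_RAISE", "MUST_RETURN"
    detail: str  # what the spec demands

def read_docstring(docstring: str) -> list[Requirement]:
    """Minimal sketch: map Google-style docstring sections to requirement kinds."""
    reqs = []
    # A "Raises:" section implies the function MUST raise that exception.
    for exc in re.findall(r"Raises:\s*\n\s*(\w+):", docstring):
        reqs.append(Requirement("MUST_RAISE", exc))
    # A "Returns:" section implies a MUST_RETURN requirement.
    if re.search(r"Returns:", docstring):
        reqs.append(Requirement("MUST_RETURN", "return value described"))
    return reqs

doc = """Charge a card.

Returns:
    A Receipt object.

Raises:
    ValueError: if amount is not positive.
"""
print([r.kind for r in read_docstring(doc)])  # ['MUST_RAISE', 'MUST_RETURN']
```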

Type reader

Reads Pydantic models and type annotations:

  • Field(gt=0), Field(ge=18), Field(min_length=1) → BOUNDARY requirements
  • Literal["USD", "EUR", "GBP"] fields → ENUM_VALID requirements
  • Function arguments with Literal type → ENUM_VALID requirements
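The `Literal` case can be illustrated with the standard library's typing introspection. A minimal sketch, assuming nothing about Quell's internals: `enum_requirements` is a hypothetical helper showing how a reader can recover the allowed values that back an ENUM_VALID requirement.

```python
from typing import Literal, get_args, get_origin, get_type_hints

def convert(amount: float, currency: Literal["USD", "EUR", "GBP"]) -> float:
    ...

def enum_requirements(func) -> dict[str, tuple]:
    """Sketch: each Literal-typed argument yields an ENUM_VALID
    requirement listing its allowed values."""
    reqs = {}
    for name, hint in get_type_hints(func).items():
        if get_origin(hint) is Literal:
            reqs[name] = get_args(hint)
    return reqs

print(enum_requirements(convert))  # {'currency': ('USD', 'EUR', 'GBP')}
```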

Bug reader

Converts natural language bug descriptions into BUG_REPRO requirements using an LLM prompt. Used by quell reproduce.

Every spec reader returns [] on any error — they never raise exceptions.
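The never-raise contract above amounts to a simple guard around every reader. A sketch of the pattern (the decorator name is illustrative, not Quell's API):

```python
def safe_reader(read):
    """Sketch of the never-raise contract: wrap a spec reader so any
    failure degrades to an empty requirement list instead of an exception."""
    def wrapped(source):
        try:
            return read(source)
        except Exception:
            return []  # a broken spec never crashes the pipeline
    return wrapped

@safe_reader
def flaky_reader(source):
    raise ValueError("malformed spec")

print(flaky_reader("..."))  # []
```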

Stage 2: Check coverage

An AST-based coverage checker scans your test files and marks each Requirement as covered or uncovered. No test execution required.

Coverage heuristics — a requirement is marked covered if the test file contains the listed pattern:

  • MUST_RAISE → pytest.raises(ExceptionType)
  • BOUNDARY → an assertion with the boundary constants (0, -1, 1)
  • ENUM_VALID → an assertion referencing the enum values
  • MUST_RETURN → an assertion on the return value
  • BUG_REPRO → never covered (always generates a test)

When in doubt, the checker marks a requirement as uncovered. Duplicate tests are cheaper than missed gaps.

Stage 3: Verify — The Moat

This is the most important stage. A test is accepted only if it satisfies both conditions:

  1. PASS on original code — run pytest <temp_test_file>. If this fails, the generated test is already broken.
  2. FAIL on violated code — inject a violation into the source, run pytest again. If the test still passes, it doesn't actually prove the requirement.

Both conditions are required. A test that passes both is verified and proceeds to the writer.
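The acceptance rule reduces to a small decision function. The outcome names below mirror the counters in the diagnostic report (fails_on_correct, doesnt_catch_violation); the function itself is a sketch of the logic, not Quell's code.

```python
def verdict(passes_on_original: bool, passes_on_violated: bool) -> str:
    """Sketch of the two-sided acceptance rule."""
    if not passes_on_original:
        return "FAILS_ON_CORRECT"        # the generated test is broken
    if passes_on_violated:
        return "DOESNT_CATCH_VIOLATION"  # the test proves nothing
    return "VERIFIED"                    # pass on correct, fail on violated

print(verdict(True, False))   # VERIFIED
print(verdict(True, True))    # DOESNT_CATCH_VIOLATION
print(verdict(False, False))  # FAILS_ON_CORRECT
```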

Violation injection

Quell injects minimal violations to trigger requirement failures:

  • MUST_RAISE → comment out the raise statement
  • BOUNDARY → weaken the threshold to -9999
  • MUST_RETURN → replace the return with return None
  • BUG_REPRO → no injection (the bug already exists)
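The MUST_RAISE injection, for example, can be done with a line-level rewrite. A minimal sketch, not Quell's injector: each raise is replaced with `pass` plus a comment so the violated source still parses.

```python
def comment_out_raises(source: str) -> str:
    """Sketch of a minimal MUST_RAISE violation: disable every raise
    statement while keeping the module syntactically valid."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("raise "):
            indent = line[: len(line) - len(stripped)]
            out.append(indent + "pass  # " + stripped)  # raise can no longer fire
        else:
            out.append(line)
    return "\n".join(out)

src = "def f(x):\n    if x < 0:\n        raise ValueError(x)\n    return x"
print(comment_out_raises(src))
```

A test that truly proves the MUST_RAISE requirement will now fail against this violated source, which is exactly the signal the verifier needs.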

Isolation

Verification always runs in a subprocess (subprocess.run), never in-process. This ensures:

  • Violations load fresh (no module caching)
  • Failures are isolated (a crashing test doesn't kill Quell)
  • Timeouts are enforced cleanly
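The isolation pattern looks roughly like this. A self-contained sketch: it runs a `-c` snippet in a fresh interpreter rather than invoking pytest, but the subprocess, timeout, and exit-code handling are the same shape.

```python
import subprocess
import sys

def run_isolated(code: str, timeout: float = 5.0) -> int:
    """Sketch: execute a snippet in a fresh interpreter so module caches,
    crashes, and hangs all stay outside the parent process."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout,
        )
        return result.returncode
    except subprocess.TimeoutExpired:
        return -1  # treated as a failed run; cleanup still happens in the parent

print(run_isolated("raise SystemExit(0)"))  # 0: clean pass
print(run_isolated("raise SystemExit(1)"))  # 1: failure
```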

File safety

try:
    backup_source()
    inject_violation()
    run_pytest()
finally:
    restore_source()    # ALWAYS runs
    cleanup_temp()      # ALWAYS runs

The source file is always restored, even if verification crashes or times out.

Stage 4: Write

Verified tests are injected into the target test file using libcst — a lossless concrete syntax tree parser.

Unlike regex-based injection, libcst:

  • Preserves your existing comments, blank lines, and indentation
  • Validates the resulting source parses correctly before writing
  • Backs up the file before making any change
  • Restores on failure

The final write sequence:

  1. Parse existing test file with libcst
  2. Parse new test function with libcst
  3. Append the new function node
  4. Validate the combined source
  5. Write to disk

If step 4 or 5 fails, the backup is restored and Quell exits cleanly.

Diagnostic report

After every --fix run, Quell writes .quell/report.json — a privacy-safe diagnostic file that records where the rule engine succeeded, where it failed, and which argument types it couldn't stub.

{
  "quell_version": "0.4.4",
  "total_requirements": 79,
  "written": 41,
  "fails_on_correct": 15,
  "doesnt_catch_violation": 0,
  "skipped": 5,
  "unknown_type_frequency": {},
  "failure_reason_frequency": {
    "test_logic_incorrect": 15
  },
  "_note": "This report contains no source code or full paths. Safe to share with the Quell maintainer to improve the rule engine."
}

What IS recorded: function names, constraint kinds, verification outcome, unknown type annotations, aggregate counts.

What is NOT recorded: source code, function bodies, full file paths, or any data that could identify proprietary business logic.

Share this file with the Quell maintainer to improve rule engine coverage — each unknown_type_frequency entry tells us exactly which type stubs to add next.

Audit log

Every action Quell takes is appended to .quell/audit.jsonl as a structured JSON record:

{
  "timestamp": "2026-05-08T14:32:01Z",
  "requirement_id": "req_abc123",
  "action": "test_written",
  "file_path": "tests/test_payments.py",
  "test_function_name": "test_process_payment_must_raise_valueerror",
  "verification_status": "VERIFIED"
}
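Appending a record of this shape is straightforward JSONL. A minimal sketch, assuming nothing about Quell's logger beyond the one-object-per-line format shown above; the `audit` helper is hypothetical.

```python
import json
from datetime import datetime, timezone

def audit(log_path: str, **fields) -> None:
    """Sketch: append one structured record per action to an
    audit.jsonl file (one JSON object per line)."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        **fields,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Because the file is append-only, a crashed run never corrupts earlier records, and the log can be replayed line by line with `json.loads`.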

Design invariants

These invariants are enforced throughout the codebase and must never be broken:

  1. verifier.py — always restores source files in a finally block
  2. writer.py — always backs up before writing, always restores on failure
  3. writer.py — always validates CST parses correctly before writing to disk
  4. No code is sent to any server unless an LLM provider is configured
  5. LLM is only called for complex/unstructured specs — never for ones the rule engine handles
  6. Verification runs in a subprocess — never in-process
  7. Every spec reader returns [] on any error — never raises