Start with the reduction loop.

The entry point is simple: you already have a large array of turns, and you do not want to keep sending that whole thing into the next model call. Nocturnus reduces the working set first, then helps you keep later turns lean.

Default mental model: your app owns the thread, and Nocturnus owns the cut, that is, the reduction of that thread to a smaller working set before the next model call. The rest of the docs assume this split.

Start in four moves

Move 1. Send raw turns to POST /context, which returns a structured briefing. This call requires an LLM (see the note on LLM extraction below).

curl -X POST http://localhost:9300/context \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: default" \
  -d '{
    "turns": [
      "User cannot log in after Okta cutover.",
      "CRM says enterprise tier and billing current.",
      "Audit log shows SAML issuer mismatch."
    ],
    "scope": "ticket-4821",
    "sessionId": "ticket-4821"
  }'
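When the turns come from variables instead of a hand-written literal, quotes or newlines inside a turn will break the inline JSON. The following is a minimal sketch of building the same request body in plain bash. The helper names json_escape and build_context_payload are hypothetical and not part of Nocturnus, and the sketch only escapes backslashes, double quotes, newlines, tabs, and carriage returns.

```shell
#!/usr/bin/env bash
# Sketch: build the Move 1 request body from shell arguments.
# json_escape and build_context_payload are hypothetical helper names.

# Escape the characters that most often break hand-built JSON strings.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslash       -> \\
  s=${s//\"/\\\"}      # double quote    -> \"
  s=${s//$'\n'/\\n}    # newline         -> \n
  s=${s//$'\t'/\\t}    # tab             -> \t
  s=${s//$'\r'/\\r}    # carriage return -> \r
  printf '%s' "$s"
}

# Usage: build_context_payload SESSION TURN [TURN...]
# SESSION is used for both "scope" and "sessionId", as in the curl example.
build_context_payload() {
  local session=$1; shift
  local turns="" turn
  for turn in "$@"; do
    turns+="${turns:+,}\"$(json_escape "$turn")\""
  done
  printf '{"turns":[%s],"scope":"%s","sessionId":"%s"}' \
    "$turns" "$(json_escape "$session")" "$(json_escape "$session")"
}

# Print a sample body; a turn containing quotes stays valid JSON.
build_context_payload ticket-4821 \
  'User said "SSO is broken".' \
  'Audit log shows SAML issuer mismatch.'
echo
```

The result can be passed to the Move 1 call with -d "$(build_context_payload "$SESSION" "${TURNS[@]}")", where TURNS is an array you maintain yourself.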

Move 2. Narrow the next decision by passing goals to POST /memory/context, which returns only goal-relevant facts.

curl -X POST http://localhost:9300/memory/context \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: default" \
  -d '{
    "goals": [{"predicate": "next_best_action", "args": ["acme_corp"]}],
    "scope": "ticket-4821",
    "sessionId": "ticket-4821",
    "maxFacts": 10,
    "format": "natural"
  }'

Move 3. On later turns, call POST /context/diff with the sessionId from Move 1. It returns only what changed since the last window.

curl -X POST http://localhost:9300/context/diff \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: default" \
  -d '{"sessionId": "ticket-4821"}'

Move 4. Clear the session when the thread ends. Use the same sessionId from Move 1 so the server knows which diff state to discard.

curl -X POST http://localhost:9300/context/session/clear \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: default" \
  -d '{"sessionId": "ticket-4821"}'
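In a script it is easy to skip Move 4 when an earlier step fails. One way to make the cleanup unconditional is a bash EXIT trap, sketched below. trap is a standard bash builtin; clear_session and with_session_cleanup are hypothetical helper names, and the request itself is the same call shown above.

```shell
#!/usr/bin/env bash
# Sketch: make sure the session is cleared even if the workflow fails.
# clear_session and with_session_cleanup are hypothetical helper names.
BASE=${BASE:-http://localhost:9300}
TENANT=${TENANT:-default}
SESSION=${SESSION:-ticket-4821}

# Move 4, unchanged. "|| true" keeps a failed cleanup from masking
# the exit status of the workflow itself.
clear_session() {
  curl -sS -X POST "$BASE/context/session/clear" \
    -H "Content-Type: application/json" \
    -H "X-Tenant-ID: $TENANT" \
    -d '{"sessionId": "'"$SESSION"'"}' || true
}

# Run a command in a subshell with the cleanup armed. The EXIT trap
# fires when the subshell ends, whether the command succeeded or not,
# and the command's exit status is passed through to the caller.
with_session_cleanup() {
  (
    trap clear_session EXIT
    "$@"
  )
}

# Example: run your own workflow function with cleanup armed.
#   with_session_cleanup my_workflow
```

The subshell keeps the trap scoped to one thread, so a long-running script can handle several sessions in sequence without the traps interfering.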

Complete Example

Copy and paste this script to run all four moves end-to-end against a local server:

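#!/bin/bash
# Complete NocturnusAI context workflow: all 4 moves
set -euo pipefail  # stop at the first failed command (e.g. server unreachable)

BASE=http://localhost:9300   # local server URL
TENANT=default               # sent as X-Tenant-ID
SESSION=ticket-4821          # reused as both scope and sessionId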

# Move 1: Send raw turns → get structured briefing (requires LLM)
curl -sS -X POST "$BASE/context" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: $TENANT" \
  -d '{
    "turns": [
      "User cannot log in after Okta cutover.",
      "CRM says enterprise tier and billing current.",
      "Audit log shows SAML issuer mismatch."
    ],
    "scope": "'"$SESSION"'",
    "sessionId": "'"$SESSION"'"
  }'
echo ""

# Move 2: Goal-driven context narrowing
curl -sS -X POST "$BASE/memory/context" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: $TENANT" \
  -d '{
    "goals": [{"predicate": "next_best_action", "args": ["acme_corp"]}],
    "scope": "'"$SESSION"'",
    "sessionId": "'"$SESSION"'",
    "maxFacts": 10,
    "format": "natural"
  }'
echo ""

# Move 3: Diff — only what changed since last window
curl -sS -X POST "$BASE/context/diff" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: $TENANT" \
  -d '{"sessionId": "'"$SESSION"'"}'
echo ""

# Move 4: Clean up session when thread ends
curl -sS -X POST "$BASE/context/session/clear" \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: $TENANT" \
  -d '{"sessionId": "'"$SESSION"'"}'
echo ""
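The script above runs every move even if an earlier one returned an error. A fail-fast variant might look like the sketch below. post_json, run_moves, and run_all are hypothetical helper names; --fail is a standard curl flag that turns HTTP 4xx/5xx responses into a non-zero exit code, and the sketch assumes the server signals failures through such status codes, which the text above does not confirm.

```shell
#!/usr/bin/env bash
# Sketch: stop at the first failed move, but always clear the session.
# post_json, run_moves, and run_all are hypothetical helper names.
BASE=${BASE:-http://localhost:9300}
TENANT=${TENANT:-default}
SESSION=${SESSION:-ticket-4821}

# POST a JSON body to a path. --fail makes curl exit non-zero on
# HTTP 4xx/5xx, so callers can chain calls with &&.
post_json() {
  local path=$1 body=$2
  curl -sS --fail -X POST "$BASE$path" \
    -H "Content-Type: application/json" \
    -H "X-Tenant-ID: $TENANT" \
    -d "$body"
}

# Moves 1 to 3 in order; the chain stops at the first failure.
run_moves() {
  post_json /context '{"turns": ["User cannot log in after Okta cutover."], "scope": "'"$SESSION"'", "sessionId": "'"$SESSION"'"}' &&
  post_json /memory/context '{"goals": [{"predicate": "next_best_action", "args": ["acme_corp"]}], "scope": "'"$SESSION"'", "sessionId": "'"$SESSION"'", "maxFacts": 10, "format": "natural"}' &&
  post_json /context/diff '{"sessionId": "'"$SESSION"'"}'
}

# Move 4 runs regardless; the status of Moves 1 to 3 is returned.
run_all() {
  local rc=0
  run_moves || rc=$?
  post_json /context/session/clear '{"sessionId": "'"$SESSION"'"}' || true
  return "$rc"
}

# Example:
#   run_all || echo "workflow failed" >&2
```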

LLM extraction required for POST /context. The Docker image connects to Ollama on your host by default, so it is enough to install Ollama and run ollama pull granite3.3:8b. For a cloud LLM, pass -e ANTHROPIC_API_KEY to docker run instead. Without an LLM, assert facts directly with POST /tell, then query them with POST /memory/context; that path needs no LLM.
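Before calling POST /context with the default Ollama setup, it can help to confirm that the model is actually present on the host. The sketch below uses the ollama list and ollama pull subcommands; ensure_model is a hypothetical helper name, and the parsing assumes ollama list prints a header row followed by one model name per line in the first column.

```shell
#!/usr/bin/env bash
# Sketch: pull the extraction model only if it is not already present.
# ensure_model is a hypothetical helper name.
MODEL=${MODEL:-granite3.3:8b}

ensure_model() {
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama not found; install Ollama first" >&2
    return 1
  fi
  # Skip the header row, take the first column, and look for an exact match.
  if ollama list | awk 'NR > 1 {print $1}' | grep -qxF "$MODEL"; then
    echo "$MODEL already pulled"
  else
    ollama pull "$MODEL"
  fi
}

# Example:
#   ensure_model && echo "ready for POST /context"
```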

Where to go next

Context Workflow →

API Reference →

SDKs and MCP →

How It Works →