Symbolic-AI engine + DSL. Composes the noesis stack as building blocks.
Hannes Lehmann 6e6078b8dd conductor-repl: occasional-chatbot behaviour + conversation history
The bot now behaves like a real chat partner, not a learning machine:
small talk is answered freely, recognized topics are answered only from
what it knows (no bluffing), and facts are learned only when the user
actually asserts one. A new small-talk rule routes any unrecognized
message straight to the articulator.

Conversation history is now first-class and fed to BOTH glue edges:
  - extraction (NL -> facts): so a follow-up like "it is the same as AfA"
    resolves "it" from the prior turn before proposing a triple.
  - articulation (facts -> NL): so answers and small talk stay coherent
    across turns.
This is also the state the semantic layer can later read to understand
follow-ups and overall conversational context.

tools/llm: extract a reusable llm.Chat(ctx, Config, []Message) from
SayTool.Execute. SayTool keeps identical behaviour (tests unchanged);
the REPL uses Chat directly to assemble history-aware message lists for
both extraction and a history-aware /llm_say articulation tool.

Verified end to end: small talk -> ask (refuse) -> "it is the same as AfA"
(history resolves "it", learns synonym_of) -> recall -> persisted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 17:18:46 +02:00
agent Rename examples/agent -> examples/klarheit; fix dropped agent/ files 2026-05-08 22:13:38 +02:00
conductor Add conductor: wire learn + reason + runtime into one loop 2026-06-16 16:19:21 +02:00
engine Add reason package: per-turn routing explanation (proof + near-miss) 2026-06-16 16:10:44 +02:00
examples conductor-repl: occasional-chatbot behaviour + conversation history 2026-06-16 17:18:46 +02:00
learn Initial commit: noematik symbolic-AI engine 2026-04-29 22:26:07 +02:00
plugins Add Korel-style PMI observer plugin and corpus-mining demo 2026-04-29 22:36:30 +02:00
reason Add reason package: per-turn routing explanation (proof + near-miss) 2026-06-16 16:10:44 +02:00
runtime Add conductor: wire learn + reason + runtime into one loop 2026-06-16 16:19:21 +02:00
tools/llm conductor-repl: occasional-chatbot behaviour + conversation history 2026-06-16 17:18:46 +02:00
triple Add triple package: LLM-content -> deterministic-Datalog seam 2026-06-16 15:28:08 +02:00
.gitignore Scope B: route chat fallback through qwen35-9b (tools/llm + .env config) 2026-05-08 22:57:07 +02:00
ARCHITECTURE.md Rename examples/agent -> examples/klarheit; fix dropped agent/ files 2026-05-08 22:13:38 +02:00
DEVLOG.md Scope B: route chat fallback through qwen35-9b (tools/llm + .env config) 2026-05-08 22:57:07 +02:00
go.mod Scope B: route chat fallback through qwen35-9b (tools/llm + .env config) 2026-05-08 22:57:07 +02:00
go.sum Scope B: route chat fallback through qwen35-9b (tools/llm + .env config) 2026-05-08 22:57:07 +02:00
LICENSE Initial commit: noematik symbolic-AI engine 2026-04-29 22:26:07 +02:00
README.md Rename examples/agent -> examples/klarheit; fix dropped agent/ files 2026-05-08 22:13:38 +02:00

noematik

A symbolic-AI engine for agent harnesses. Skills are Datalog files, tools are Go plugins, and every rule set is validated brutally before it ever runs a query.

The contract: a rule set either loads cleanly through every validation gate, or it does not load at all. There is no "loaded but inconsistent" state. Once loaded, the engine is read-only and concurrency-safe.

Why this exists

Agent harnesses today encode their decision logic in procedural code: hardcoded if/else over strings, scattered permission checks, ad-hoc plan selection. That is fine when humans write the harness. It is not fine when agents write the rules, which is the direction this project is built for. LLMs write declarative facts and rules reliably; they write procedural state machines unreliably. The approach follows from that: make the rule layer declarative, validate it brutally, and restrict the procedural surface to a small set of vetted plugins.

A rule set that loads is a rule set that is provably internally consistent. No "passes the test, fails in production" gap. Bugs in rules are syntactic or semantic and detectable at load time.

Architecture

                              ┌─────────────────────────┐
                              │  agent (cross-cut layer)│
                              │  Session, RunTurn,      │
                              │  memory subsystem,      │
                              │  Scheduler              │
                              └────┬────────────────┬───┘
                                   │ uses           │ uses
                ┌──────────────────▼───┐    ┌──────▼─────────┐
                │       engine         │    │    runtime     │
                │  Mangle + validation │    │  Tool + Plugin │
                │  Program / Atom /    │    │  Run / Hydrate │
                │  Fact / Provenance   │    │                │
                └──────────────────────┘    └────────────────┘
                            ▲                       ▲
                            │                       │
                       ┌────┴────┐             ┌────┴─────┐
                       │  learn  │             │ plugins  │
                       │ Trainer │             │ semantic │
                       │ + Valid │             │   pmi    │
                       │ filter  │             │          │
                       └─────────┘             └──────────┘

engine and runtime are pure primitives — engine knows facts, rules, types, and the validation pipeline; runtime knows Tools, Plugins, and the per-task loop. agent is the assembly layer above both, where stateful concerns live (Session, memory tiers) and where chat-style entry points (RunTurn) compose engine queries with runtime dispatch. Anything that touches the real world (LLMs, embeddings, file I/O, network) is a Tool or a Plugin.

Validation pipeline

Every rule set passes through these gates on engine.Load. Failure at any stage returns a typed LoadError{Stage, Cause} and no Program is returned.

Stage      Built by                      Catches
parse      Mangle                        Syntax errors with source position
analyze    Mangle                        Decl well-formedness, undeclared predicates
stratify   Mangle (surfaced from eval)   Negation cycles
evaluate   Mangle                        Bottom-up materialization of derived facts
typecheck  noematik                      Decl bounds enforced against materialized facts

Mangle's analysis pass validates that bound-decls are well-formed but does not enforce them against facts; noematik's typecheck stage closes that gap by running the type checker against every materialized fact.

Packages

engine/        Mangle wrapper, validation, Load+Extend+WithFacts+Atoms,
               Save+LoadFile, Provenance
agent/         Cross-cut assembly: Session (stateful façade), full
               memory subsystem (Compressor / Reaper / LoopDetector /
               MemoryTiers + Consolidator), Scheduler
runtime/       Tool + Plugin registries, per-task loop (Run, Hydrate),
               audit trail
learn/         Trainer, Observer/Proposer/Reviewer, validator-as-filter
plugins/
  semantic/    Adapter for ~/work/semantic-layer — hydrates matched/2
  pmi/         Korel-style PMI observer — mines co-occurrence rules
                from a corpus of token sequences
examples/
  kinship/        Classical ILP — Trainer learns grandparent rule
  routing/        3 skills × 8 tasks — accept/reject verdict matrix
  klarheit/       Klarheit skill end-to-end — routing + plan + tool loop
  parity/         Bit-identical output check vs ichiban Prolog baseline
  chat/           Multi-turn Session with conversation history as facts
  pmi/            PMI plugin against a small synthetic corpus
  session-learn/  Trainer mining patterns from a synthetic Session log
  memory/         Compressor + Reaper + LoopDetector + Scheduler combined
  memory-tiers/   Short-term / long-term split via MemoryTiers
                  (decay-driven short, persisted long, with consolidation)

Memory subsystem

Long-running agents accumulate facts across turns. Without bounds the source text grows linearly, re-eval slows down, and the agent ends up reasoning over stale signals from twenty minutes ago. The memory subsystem is five primitives that together turn a Session into a working/long-term split with explicit lifecycles.

Primitive                         Where               Job
agent.Session                     agent/session.go    Stateful façade over an evolving Program; tracks turn counter, addedAt, lastTouched, touchCount per fact
agent.Compressor                  agent/compress.go   Replaces raw turn-facts older than PreserveLastN with a per-turn compressed_episode/3 summary
agent.Reaper                      agent/decay.go      Drops base facts that exceed MaxAge (calendar) or MaxIdle (since last touch); per-predicate DecayRules combine with OR
agent.LoopDetector                agent/loop.go       Pure query over executed/4 audit; flags (Tool,Args) clusters that repeat in MinRepeats distinct turns within Window
agent.MemoryTiers + Consolidator  agent/tiers.go      Bundles two Sessions; Consolidate promotes short-term facts that meet MinAge AND MinTouches AND predicate filter to long-term
agent.Scheduler                   agent/scheduler.go  Runs IntervalPlugins (runtime.Plugin + Interval()) on tickers, folds output into a bound Session via AddFacts
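The LoopDetector row packs an algorithm into one sentence, so a small sketch may help. This is not the agent package's code: the Executed struct, the detectLoops function, and the reading of Window as "the last N turns ending at the current turn" are assumptions. The point it illustrates is that repeats are counted per distinct turn, so a tool called twice within one turn counts once.

```go
package main

import (
	"fmt"
	"sort"
)

// Executed is a hypothetical, flattened view of one executed/4 audit entry.
type Executed struct {
	Turn int
	Tool string
	Args string
}

// detectLoops returns the "tool args" keys that appear in at least
// minRepeats distinct turns among the last `window` turns ending at
// currentTurn. The result is sorted for stable output.
func detectLoops(audit []Executed, currentTurn, window, minRepeats int) []string {
	turns := map[string]map[int]bool{}
	for _, e := range audit {
		if e.Turn > currentTurn || e.Turn <= currentTurn-window {
			continue // outside the window
		}
		key := e.Tool + " " + e.Args
		if turns[key] == nil {
			turns[key] = map[int]bool{}
		}
		turns[key][e.Turn] = true // a set, so same-turn repeats collapse
	}
	var flagged []string
	for key, seen := range turns {
		if len(seen) >= minRepeats {
			flagged = append(flagged, key)
		}
	}
	sort.Strings(flagged)
	return flagged
}

func main() {
	audit := []Executed{
		{1, "/search", "afa"},
		{2, "/search", "afa"},
		{2, "/search", "afa"}, // same turn as the previous entry
		{3, "/read", "doc1"},
		{4, "/search", "afa"},
	}
	fmt.Println(detectLoops(audit, 4, 4, 3))
}
```

Because the function only reads the audit slice, it stays a pure query in the sense the table describes: it flags, and leaves the decision about what to do with a flagged cluster to the caller.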

Two design rules hold throughout:

  1. Validation-first ordering. Promote-before-remove, filter-then-Load, long-tier-Add before short-tier-RemoveFacts. A validator rejection never leaves a half-mutated session.
  2. Only base facts decay. Derived facts have no addedAt metadata and would be re-derived on the next eval anyway, so the Reaper skips them by design.

Persistence: prog.Save(path) writes a Program's source to disk; engine.LoadFile(path) re-loads it through the full validation pipeline. The memory-tiers example uses this to drop short-term and revive long-term across process boundaries.

Working examples

Routing (multi-skill verdict)

cd ~/work/noematik
go run ./examples/routing/

Loads research_agent, coding_agent, klarheit_agent plus 8 tasks with pre-computed matched/2 facts. Prints which skill accepts each task, with reasons; warns on multi-match; surfaces silent rejects and tasks that would have been accepted but for a blocker.

Klarheit (full agent loop)

go run ./examples/klarheit/

Klarheit skill, three tasks, real tool dispatch:

  • /t1 ("Begriffe definieren"): two-tool plan, completes in 2 turns
  • /t2 ("Annahme offenlegen"): one-tool plan, completes in 1 turn
  • /t3 (klarheit signal but /implement blocker): runtime refuses to start because no skill accepts

Parity (vs cognicore/agent ichiban Prolog)

go test ./examples/parity/

7 input combinations × 5 plan rules. Verifies that noematik's klarheit port produces the same plan tuples as cognicore's klarheit.pl, including the negation case (begriff + struktur → 0 plans because of !matched(/struktur) blocker).

Learning (Kinship ILP)

go run ./examples/kinship/

Trainer iterates over five candidate rules for grandparent/2. The correct rule is adopted; an over-generalization is reviewer-vetoed (would derive forbidden facts); broken syntax is parser-rejected; an unstratifiable cycle is engine-rejected; an irrelevant fact is reviewer-vetoed for not adding marginal coverage.
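Two of the verdicts above (veto for deriving forbidden facts, veto for no marginal coverage) can be illustrated with a small sketch. It does not use the learn package: candidate rules are modelled as plain Go functions rather than Datalog source, the names pair, candidate, and review are hypothetical, and the parser-rejected and engine-rejected cases are not modelled because they depend on the real validation pipeline.

```go
package main

import "fmt"

type pair struct{ a, b string }

// candidate is a hypothetical proposed rule, modelled as a function from
// parent/2 facts to the grandparent/2 facts it would derive.
type candidate struct {
	name   string
	derive func(parent []pair) map[pair]bool
}

// review vetoes a candidate that derives any forbidden fact, then vetoes
// one that covers none of the wanted facts, and adopts it otherwise.
func review(c candidate, parent, wanted, forbidden []pair) string {
	derived := c.derive(parent)
	for _, f := range forbidden {
		if derived[f] {
			return "veto: derives forbidden fact"
		}
	}
	for _, w := range wanted {
		if derived[w] {
			return "adopt"
		}
	}
	return "veto: no marginal coverage"
}

func candidates() []candidate {
	return []candidate{
		// grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
		{"join", func(parent []pair) map[pair]bool {
			out := map[pair]bool{}
			for _, p := range parent {
				for _, q := range parent {
					if p.b == q.a {
						out[pair{p.a, q.b}] = true
					}
				}
			}
			return out
		}},
		// grandparent(X, Y) :- parent(X, Y).  (over-generalization)
		{"copy", func(parent []pair) map[pair]bool {
			out := map[pair]bool{}
			for _, p := range parent {
				out[p] = true
			}
			return out
		}},
		// Derives nothing about grandparent/2.
		{"irrelevant", func([]pair) map[pair]bool { return map[pair]bool{} }},
	}
}

func main() {
	parent := []pair{{"ann", "bob"}, {"bob", "cid"}}
	wanted := []pair{{"ann", "cid"}}
	forbidden := []pair{{"ann", "bob"}}
	for _, c := range candidates() {
		fmt.Printf("%-10s %s\n", c.name, review(c, parent, wanted, forbidden))
	}
}
```

Checking forbidden facts before coverage means a candidate that covers the positives but also derives a negative is still vetoed, matching the over-generalization case described above.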

API surface

// Loading and validation
prog, err := engine.Load(source)
prog2, err := prog.Extend(rulesSource)              // add rules; full re-validation
prog3, err := prog.WithFacts([]engine.Fact{...})    // add facts; types enforced

// Typed query
atoms, err := prog.Atoms("predicate")
for _, a := range atoms {
    if v, ok := a.Args[0].Atom(); ok {
        _ = v // /atom-name
    }
    if v, ok := a.Args[1].Text(); ok {
        _ = v // raw UTF-8 string, no escapes
    }
    if v, ok := a.Args[2].Int(); ok {
        _ = v // int64
    }
    if list, ok := a.Args[3].List(); ok {
        _ = list // []Term
    }
}

// Convenience
rows, err := prog.Query("predicate")   // canonical string form per atom

// Runtime
rt := &runtime.Runtime{
    Registry:       runtime.NewRegistry(),         // tools
    PluginRegistry: runtime.NewPluginRegistry(),   // optional
    MaxTurns:       10,
}
rt.Registry.Register(myTool)
rt.PluginRegistry.Register(mySemantic)

prog, err := engine.Load(skillSource)
res, err := rt.Hydrate(ctx, prog, runtime.PluginInput{TaskID: "/t1", Text: "..."}, nil)
result, err := rt.Run(ctx, res.Final, "/t1")

// Learning
tr := &learn.Trainer{Observer: ..., Proposer: ..., Reviewer: ...}
nextProg, outcomes, err := tr.Step(prog)

Skill file conventions

A skill is a .mg file: Datalog with an explicit Decl for every predicate the runtime might assert. The convention:

# Always-declared predicates (the agent runtime contract)
Decl task(T).
Decl matched(T, Term).
Decl executed(T, Tool, Args, Result).
Decl tool_used(T, Tool).
Decl next_action(T, Tool, Args).
Decl complete(T).

# Routing
Decl match_signal(Skill, T).
Decl match_blocker(Skill, T).
Decl accepts(Skill, T).
Decl accept_reason(Skill, T, Reason).

# Aux predicates for stratified-negation safety
tool_used(T, Tool) :- executed(T, Tool, _, _).

# Skill-specific routing rules
match_signal(/my_skill, T) :- matched(T, /keyword).
accepts(/my_skill, T) :-
    match_signal(/my_skill, T),
    !match_blocker(/my_skill, T).

# Skill-specific planning rules
next_action(T, /tool_x, [/arg]) :-
    accepts(/my_skill, T),
    !tool_used(T, /tool_x).

complete(T) :-
    accepts(/my_skill, T),
    executed(T, /tool_x, _, /ok).

Mangle disallows unbound variables under negation, so the tool_used/2 aux predicate is required to express "tool has not been used yet" without reaching for _ placeholders.
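The role of the aux predicate can be pictured as a projection followed by a set difference. The sketch below emulates that in plain Go and is not how Mangle evaluates rules. Executed, toolUsed, and nextActions are hypothetical names. In executed/4 the Args and Result positions would be unbound under negation; projecting onto (Task, Tool) first leaves only positions that the positive part of the rule has already bound.

```go
package main

import "fmt"

// Executed is a hypothetical flattened executed/4 fact.
type Executed struct{ Task, Tool, Args, Result string }

// toolUsed projects executed/4 onto (Task, Tool), mirroring
// tool_used(T, Tool) :- executed(T, Tool, _, _).
func toolUsed(log []Executed) map[[2]string]bool {
	used := map[[2]string]bool{}
	for _, e := range log {
		used[[2]string{e.Task, e.Tool}] = true
	}
	return used
}

// nextActions returns the accepted tasks for which the tool has not run
// yet, mirroring accepts(..., T), !tool_used(T, Tool). Both keys of the
// negated lookup are bound before the lookup happens.
func nextActions(accepted []string, tool string, log []Executed) []string {
	used := toolUsed(log)
	var out []string
	for _, t := range accepted {
		if !used[[2]string{t, tool}] {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	log := []Executed{{"/t1", "/tool_x", "[/arg]", "/ok"}}
	fmt.Println(nextActions([]string{"/t1", "/t2"}, "/tool_x", log))
}
```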

Test status

engine            86.9% statement coverage  race-clean
agent             88.6% statement coverage  race-clean
learn             97.4% statement coverage  race-clean
runtime           94.7% statement coverage  race-clean
plugins/semantic  92.6% statement coverage  race-clean
plugins/pmi       90.0% statement coverage  race-clean
examples/parity   passes 7/7 fixtures vs cognicore/agent baseline

Performance baseline (Ryzen 5 3400G):

Operation                              Time    Allocations
Load (small program)                   114µs   474
Query (recursive transitive closure)   1.3µs   10
WithFacts (1 fact)                     75µs    233
WithFacts (50 facts)                   1.0ms   2973

Production consumer

~/work/cognicore/agent migrated its planning engine from ichiban Prolog to noematik. Output is bit-identical on all 7 parity fixtures. The migration revealed one API gap: Diagnose needs a lastView cache because Programs are immutable. That gap has since been closed by agent.Session, which holds the evolving Program plus turn-scoped metadata behind a single read/write-locked façade.

License

MIT.