Symbolic-AI engine + DSL. Composes the noesis stack as building blocks.


Hannes Lehmann 6e6078b8dd 2026-06-16 17:18:46 +02:00

conductor-repl: occasional-chatbot behaviour + conversation history

The bot now behaves like a real chat partner, not a learning machine: small talk is answered freely, recognized topics are answered only from what it knows (no bluffing), and facts are learned only when the user actually asserts one. A new small-talk rule routes any unrecognized message straight to the articulator.

Conversation history is now first-class and fed to BOTH glue edges:
- extraction (NL -> facts): so a follow-up like "it is the same as AfA" resolves "it" from the prior turn before proposing a triple.
- articulation (facts -> NL): so answers and small talk stay coherent across turns.

This is also the state the semantic layer can later read to understand follow-ups and overall conversational context.

tools/llm: extract a reusable llm.Chat(ctx, Config, []Message) from SayTool.Execute. SayTool keeps identical behaviour (tests unchanged); the REPL uses Chat directly to assemble history-aware message lists for both extraction and a history-aware /llm_say articulation tool.

Verified end to end: small talk -> ask (refuse) -> "it is the same as AfA" (history resolves "it", learns synonym_of) -> recall -> persisted.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

agent	Rename examples/agent -> examples/klarheit; fix dropped agent/ files	2026-05-08 22:13:38 +02:00
conductor	Add conductor: wire learn + reason + runtime into one loop	2026-06-16 16:19:21 +02:00
engine	Add reason package: per-turn routing explanation (proof + near-miss)	2026-06-16 16:10:44 +02:00
examples	conductor-repl: occasional-chatbot behaviour + conversation history	2026-06-16 17:18:46 +02:00
learn	Initial commit: noematik symbolic-AI engine	2026-04-29 22:26:07 +02:00
plugins	Add Korel-style PMI observer plugin and corpus-mining demo	2026-04-29 22:36:30 +02:00
reason	Add reason package: per-turn routing explanation (proof + near-miss)	2026-06-16 16:10:44 +02:00
runtime	Add conductor: wire learn + reason + runtime into one loop	2026-06-16 16:19:21 +02:00
tools/llm	conductor-repl: occasional-chatbot behaviour + conversation history	2026-06-16 17:18:46 +02:00
triple	Add triple package: LLM-content -> deterministic-Datalog seam	2026-06-16 15:28:08 +02:00
.gitignore	Scope B: route chat fallback through qwen35-9b (tools/llm + .env config)	2026-05-08 22:57:07 +02:00
ARCHITECTURE.md	Rename examples/agent -> examples/klarheit; fix dropped agent/ files	2026-05-08 22:13:38 +02:00
DEVLOG.md	Scope B: route chat fallback through qwen35-9b (tools/llm + .env config)	2026-05-08 22:57:07 +02:00
go.mod	Scope B: route chat fallback through qwen35-9b (tools/llm + .env config)	2026-05-08 22:57:07 +02:00
go.sum	Scope B: route chat fallback through qwen35-9b (tools/llm + .env config)	2026-05-08 22:57:07 +02:00
LICENSE	Initial commit: noematik symbolic-AI engine	2026-04-29 22:26:07 +02:00
README.md	Rename examples/agent -> examples/klarheit; fix dropped agent/ files	2026-05-08 22:13:38 +02:00

README.md

noematik

A symbolic-AI engine for agent harnesses. Skills are Datalog files, tools are Go plugins, and every rule set is validated brutally before it ever runs a query.

The contract: a rule set either loads cleanly through every validation gate, or it does not load at all. There is no "loaded but inconsistent" state. Once loaded, the engine is read-only and concurrency-safe.

Why this exists

Agent harnesses today encode their decision logic in procedural code: hardcoded if/else over strings, scattered permission checks, ad-hoc plan selection. That is fine when humans write the harness. It is not fine when agents write the rules — which is the direction this project is built for. LLMs write declarative facts and rules reliably; they write procedural state machines unreliably. Make the rule layer declarative, validate it brutally, restrict the procedural surface to a small set of vetted plugins.

A rule set that loads is a rule set that is provably internally consistent. No "passes the test, fails in production" gap. Bugs in rules are syntactic or semantic and detectable at load time.

Architecture

                              ┌─────────────────────────┐
                              │  agent (cross-cut layer)│
                              │  Session, RunTurn,      │
                              │  memory subsystem,      │
                              │  Scheduler              │
                              └────┬────────────────┬───┘
                                   │ uses           │ uses
                ┌──────────────────▼───┐    ┌──────▼─────────┐
                │       engine         │    │    runtime     │
                │  Mangle + validation │    │  Tool + Plugin │
                │  Program / Atom /    │    │  Run / Hydrate │
                │  Fact / Provenance   │    │                │
                └──────────────────────┘    └────────────────┘
                            ▲                       ▲
                            │                       │
                       ┌────┴────┐             ┌────┴─────┐
                       │  learn  │             │ plugins  │
                       │ Trainer │             │ semantic │
                       │ + Valid │             │   pmi    │
                       │ filter  │             │          │
                       └─────────┘             └──────────┘

engine and runtime are pure primitives — engine knows facts, rules, types, and the validation pipeline; runtime knows Tools, Plugins, and the per-task loop. agent is the assembly layer above both, where stateful concerns live (Session, memory tiers) and where chat-style entry points (RunTurn) compose engine queries with runtime dispatch. Anything that touches the real world (LLMs, embeddings, file I/O, network) is a Tool or a Plugin.

Validation pipeline

Every rule set passes through these gates on engine.Load. Failure at any stage returns a typed LoadError{Stage, Cause} and no Program is returned.

| Stage | Built by | Catches |
|---|---|---|
| `parse` | Mangle | Syntax errors with source position |
| `analyze` | Mangle | Decl well-formedness, undeclared predicates |
| `stratify` | Mangle (surfaced from eval) | Negation cycles |
| `evaluate` | Mangle | Bottom-up materialization of derived facts |
| `typecheck` | noematik | Decl bounds enforced against materialized facts |

Mangle's analysis pass validates that bound-decls are well-formed but does not enforce them against facts; noematik's typecheck stage closes that gap by running the type checker against every materialized fact.
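The all-or-nothing contract of the pipeline can be illustrated with a minimal Go sketch. It uses only the standard library and does not call noematik or Mangle. The `LoadError{Stage, Cause}` shape comes from the text above, while the `gate` type, the two toy checks, and this `Load` function are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// LoadError mirrors the shape named in the text: the failing stage and its cause.
type LoadError struct {
	Stage string
	Cause error
}

func (e *LoadError) Error() string { return e.Stage + ": " + e.Cause.Error() }
func (e *LoadError) Unwrap() error { return e.Cause }

// Program is a stand-in for a loaded, fully validated rule set.
type Program struct{ Source string }

// gate is one validation stage. The checks below are toy placeholders,
// not the real parse or analyze logic.
type gate struct {
	name  string
	check func(src string) error
}

var gates = []gate{
	{"parse", func(src string) error {
		if !strings.HasSuffix(strings.TrimSpace(src), ".") {
			return errors.New("clause does not end with '.'")
		}
		return nil
	}},
	{"analyze", func(src string) error {
		if !strings.Contains(src, "Decl ") {
			return errors.New("no Decl found")
		}
		return nil
	}},
}

// Load runs every gate in order. The first failure aborts the load and
// no Program is returned, so a half-validated Program cannot exist.
func Load(src string) (*Program, error) {
	for _, g := range gates {
		if err := g.check(src); err != nil {
			return nil, &LoadError{Stage: g.name, Cause: err}
		}
	}
	return &Program{Source: src}, nil
}

func main() {
	_, err := Load("task(/t1).")
	var le *LoadError
	if errors.As(err, &le) {
		fmt.Println("rejected at stage:", le.Stage)
	}

	prog, err := Load("Decl task(T).\ntask(/t1).")
	fmt.Println("loaded:", prog != nil, "err:", err)
}
```

Callers can branch on the stage with `errors.As`, which is one plausible reason to return a typed error rather than a plain string.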

Packages

engine/        Mangle wrapper, validation, Load+Extend+WithFacts+Atoms,
               Save+LoadFile, Provenance
agent/         Cross-cut assembly: Session (stateful façade), full
               memory subsystem (Compressor / Reaper / LoopDetector /
               MemoryTiers + Consolidator), Scheduler
runtime/       Tool + Plugin registries, per-task loop (Run, Hydrate),
               audit trail
learn/         Trainer, Observer/Proposer/Reviewer, validator-as-filter
plugins/
  semantic/    Adapter for ~/work/semantic-layer — hydrates matched/2
  pmi/         Korel-style PMI observer — mines co-occurrence rules
               from a corpus of token sequences
examples/
  kinship/        Classical ILP — Trainer learns grandparent rule
  routing/        3 skills × 8 tasks — accept/reject verdict matrix
  klarheit/       Klarheit skill end-to-end — routing + plan + tool loop
  parity/         Bit-identical output check vs ichiban Prolog baseline
  chat/           Multi-turn Session with conversation history as facts
  pmi/            PMI plugin against a small synthetic corpus
  session-learn/  Trainer mining patterns from a synthetic Session log
  memory/         Compressor + Reaper + LoopDetector + Scheduler combined
  memory-tiers/   Short-term / long-term split via MemoryTiers
                  (decay-driven short, persisted long, with consolidation)

Memory subsystem

Long-running agents accumulate facts across turns. Without bounds the source text grows linearly, re-eval slows down, and the agent ends up reasoning over stale signals from twenty minutes ago. The memory subsystem consists of the six primitives listed below, which together turn a Session into a working/long-term split with explicit lifecycles.

| Primitive | Where | Job |
|---|---|---|
| `agent.Session` | `agent/session.go` | Stateful façade over an evolving Program; tracks turn counter, addedAt, lastTouched, touchCount per fact |
| `agent.Compressor` | `agent/compress.go` | Replaces raw turn-facts older than `PreserveLastN` with a per-turn `compressed_episode/3` summary |
| `agent.Reaper` | `agent/decay.go` | Drops base facts that exceed `MaxAge` (calendar) or `MaxIdle` (since last touch); per-predicate `DecayRule`s combine with OR |
| `agent.LoopDetector` | `agent/loop.go` | Pure query over `executed/4` audit; flags `(Tool,Args)` clusters that repeat in `MinRepeats` distinct turns within `Window` |
| `agent.MemoryTiers` + `Consolidator` | `agent/tiers.go` | Bundles two Sessions; `Consolidate` promotes short-term facts that meet `MinAge` AND `MinTouches` AND predicate filter to long-term |
| `agent.Scheduler` | `agent/scheduler.go` | Runs `IntervalPlugin`s (`runtime.Plugin` + `Interval()`) on tickers, folds output into a bound Session via `AddFacts` |
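The decay behaviour described for the Reaper can be sketched in plain Go. This is not the code in `agent/decay.go`. The `Fact` struct, the `Reap` function, and the reading of "combine with OR" as "drop when either limit is exceeded" are assumptions; only the names `MaxAge`, `MaxIdle`, and `DecayRule` come from the table.

```go
package main

import (
	"fmt"
	"time"
)

// Fact is a hypothetical stand-in for a tracked session fact.
type Fact struct {
	Predicate   string
	Text        string
	Derived     bool // derived facts carry no lifecycle metadata
	AddedAt     time.Time
	LastTouched time.Time
}

// DecayRule holds the limits for one predicate. A zero duration disables
// that limit (an assumption of this sketch).
type DecayRule struct {
	MaxAge  time.Duration
	MaxIdle time.Duration
}

// expired reports whether the fact exceeds MaxAge OR MaxIdle at time now.
func (r DecayRule) expired(f Fact, now time.Time) bool {
	tooOld := r.MaxAge > 0 && now.Sub(f.AddedAt) > r.MaxAge
	tooIdle := r.MaxIdle > 0 && now.Sub(f.LastTouched) > r.MaxIdle
	return tooOld || tooIdle
}

// Reap returns the surviving facts. Derived facts and predicates without
// a rule are always kept.
func Reap(facts []Fact, rules map[string]DecayRule, now time.Time) []Fact {
	kept := make([]Fact, 0, len(facts))
	for _, f := range facts {
		rule, ok := rules[f.Predicate]
		if !f.Derived && ok && rule.expired(f, now) {
			continue
		}
		kept = append(kept, f)
	}
	return kept
}

func main() {
	now := time.Date(2026, 6, 16, 12, 0, 0, 0, time.UTC)
	facts := []Fact{
		{Predicate: "said", Text: "said(/t1, \"hi\")", AddedAt: now.Add(-2 * time.Hour), LastTouched: now.Add(-2 * time.Hour)},
		{Predicate: "said", Text: "said(/t9, \"bye\")", AddedAt: now.Add(-time.Minute), LastTouched: now.Add(-time.Minute)},
		{Predicate: "topic", Text: "topic(/t1, /greeting)", Derived: true},
	}
	rules := map[string]DecayRule{
		"said": {MaxAge: 24 * time.Hour, MaxIdle: 30 * time.Minute},
	}
	for _, f := range Reap(facts, rules, now) {
		fmt.Println(f.Text)
	}
}
```

In this run the first `said` fact is younger than `MaxAge` but has been idle for two hours, so the idle limit alone is enough to drop it.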

Two design rules hold throughout:

Validation-first ordering. Promote-before-remove, filter-then-Load, long-tier-Add before short-tier-RemoveFacts. A validator rejection never leaves a half-mutated session.
Only base facts decay. Derived facts have no addedAt metadata and would be re-derived on the next eval anyway, so the Reaper skips them by design.
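The validation-first ordering can be made concrete with a small standard-library sketch of promote-before-remove. The `Tier` type, its `validate` hook, and the helper validators are hypothetical stand-ins for a Session and the load pipeline, not noematik's types.

```go
package main

import (
	"errors"
	"fmt"
)

var errRejected = errors.New("rejected by validator")

// Toy validators standing in for the full validation pipeline.
func acceptAll([]string) error { return nil }
func rejectAll([]string) error { return errRejected }

// Tier is a hypothetical stand-in for one Session's base facts.
type Tier struct {
	facts    []string
	validate func(facts []string) error
}

// Add validates the candidate fact set first and commits only on success.
func (t *Tier) Add(newFacts []string) error {
	candidate := append(append([]string{}, t.facts...), newFacts...)
	if err := t.validate(candidate); err != nil {
		return err // t.facts is untouched
	}
	t.facts = candidate
	return nil
}

// Remove drops the given facts; it cannot fail in this sketch.
func (t *Tier) Remove(drop []string) {
	gone := make(map[string]bool, len(drop))
	for _, f := range drop {
		gone[f] = true
	}
	var kept []string
	for _, f := range t.facts {
		if !gone[f] {
			kept = append(kept, f)
		}
	}
	t.facts = kept
}

// Consolidate adds to the long tier first and removes from the short tier
// only afterwards, so a rejection leaves both tiers exactly as they were.
func Consolidate(short, long *Tier, promote []string) error {
	if err := long.Add(promote); err != nil {
		return fmt.Errorf("consolidate: %w", err)
	}
	short.Remove(promote)
	return nil
}

func main() {
	short := &Tier{facts: []string{"likes(/u, /tea).", "said(/t1, /hi)."}, validate: acceptAll}
	long := &Tier{validate: rejectAll}

	err := Consolidate(short, long, []string{"likes(/u, /tea)."})
	fmt.Println("error:", err, "short:", len(short.facts), "long:", len(long.facts))

	long.validate = acceptAll
	err = Consolidate(short, long, []string{"likes(/u, /tea)."})
	fmt.Println("error:", err, "short:", len(short.facts), "long:", len(long.facts))
}
```

Reversing the two steps in `Consolidate` would lose the fact whenever the long tier rejects it, which is the half-mutated state the ordering rule is meant to exclude.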

Persistence: prog.Save(path) writes a Program's source to disk; engine.LoadFile(path) re-loads it through the full validation pipeline. The memory-tiers example uses this to drop short-term and revive long-term across process boundaries.

Working examples

Routing (multi-skill verdict)

cd ~/work/noematik
go run ./examples/routing/

Loads research_agent, coding_agent, klarheit_agent plus 8 tasks with pre-computed matched/2 facts. Prints which skill accepts each task, with reasons; warns on multi-match; and surfaces both silent rejects and tasks that would have been accepted but for a blocker.
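The verdict categories in that output can be sketched without the engine. The following Go program is an illustration under assumptions: keyword lists stand in for the `match_signal` and `match_blocker` rules, and the `Skill`, `Verdict`, and `Route` names as well as the `/paper` keyword are hypothetical.

```go
package main

import "fmt"

// Skill is a hypothetical stand-in: keyword lists play the role of the
// match_signal and match_blocker rules of one skill file.
type Skill struct {
	Name     string
	Signals  []string
	Blockers []string
}

// Verdict collects, for one task, the skills that accept it and the
// skills whose signal fired but was vetoed by a blocker.
type Verdict struct {
	Accepted []string
	Blocked  []string
}

func hasAny(matched map[string]bool, keys []string) bool {
	for _, k := range keys {
		if matched[k] {
			return true
		}
	}
	return false
}

// Route applies "accepts = signal AND NOT blocker" to every skill.
func Route(skills []Skill, matched map[string]bool) Verdict {
	var v Verdict
	for _, s := range skills {
		if !hasAny(matched, s.Signals) {
			continue // no signal: this skill stays silent
		}
		if hasAny(matched, s.Blockers) {
			v.Blocked = append(v.Blocked, s.Name)
			continue
		}
		v.Accepted = append(v.Accepted, s.Name)
	}
	return v
}

// Describe names the case a verdict falls into.
func (v Verdict) Describe() string {
	switch {
	case len(v.Accepted) > 1:
		return "multi-match"
	case len(v.Accepted) == 1:
		return "accepted"
	case len(v.Blocked) > 0:
		return "blocked"
	default:
		return "silent reject"
	}
}

func main() {
	skills := []Skill{
		{Name: "research_agent", Signals: []string{"/paper"}},
		{Name: "klarheit_agent", Signals: []string{"/begriff"}, Blockers: []string{"/implement"}},
	}
	tasks := []struct {
		ID      string
		Matched map[string]bool
	}{
		{"/t1", map[string]bool{"/begriff": true}},
		{"/t2", map[string]bool{"/begriff": true, "/implement": true}},
		{"/t3", map[string]bool{}},
	}
	for _, t := range tasks {
		v := Route(skills, t.Matched)
		fmt.Println(t.ID, v.Describe(), v.Accepted, v.Blocked)
	}
}
```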

Klarheit (full agent loop)

go run ./examples/klarheit/

Klarheit skill, three tasks, real tool dispatch:

/t1 ("Begriffe definieren"): two-tool plan, completes in 2 turns
/t2 ("Annahme offenlegen"): one-tool plan, completes in 1 turn
/t3 (klarheit signal but /implement blocker): runtime refuses to start because no skill accepts the task
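The three outcomes above follow from a simple per-task loop: refuse if no skill accepts, otherwise run one planned tool per turn until the plan is exhausted or a turn limit is reached. The sketch below shows that loop shape under assumptions. `Tool`, `Plan`, `sequence`, this `Run` signature, and the tool names are hypothetical and are not the `runtime` package's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Tool is a hypothetical minimal tool: it receives a task ID and reports a result.
type Tool func(taskID string) (string, error)

// Plan stands in for the next_action rules: given the tools already used,
// it names the next tool, or "" when the task is complete.
type Plan func(used map[string]bool) string

var (
	ErrNoSkillAccepts = errors.New("no skill accepts task")
	ErrMaxTurns       = errors.New("turn limit reached before completion")
)

// sequence builds a Plan that asks for each tool once, in order.
func sequence(tools ...string) Plan {
	return func(used map[string]bool) string {
		for _, t := range tools {
			if !used[t] {
				return t
			}
		}
		return ""
	}
}

// Run executes one tool per turn until the plan is done or maxTurns is hit.
func Run(taskID string, accepted bool, plan Plan, tools map[string]Tool, maxTurns int) (turns int, err error) {
	if !accepted {
		return 0, ErrNoSkillAccepts // refuse to start
	}
	used := map[string]bool{}
	for turns < maxTurns {
		next := plan(used)
		if next == "" {
			return turns, nil
		}
		tool, ok := tools[next]
		if !ok {
			return turns, fmt.Errorf("unknown tool %s", next)
		}
		if _, err := tool(taskID); err != nil {
			return turns, fmt.Errorf("tool %s: %w", next, err)
		}
		used[next] = true
		turns++
	}
	if plan(used) == "" {
		return turns, nil
	}
	return turns, ErrMaxTurns
}

func main() {
	ok := func(string) (string, error) { return "/ok", nil }
	tools := map[string]Tool{"/define_terms": ok, "/summarize": ok}

	turns, err := Run("/t1", true, sequence("/define_terms", "/summarize"), tools, 10)
	fmt.Println("/t1:", turns, err)

	turns, err = Run("/t3", false, sequence("/define_terms"), tools, 10)
	fmt.Println("/t3:", turns, err)
}
```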

Parity (vs cognicore/agent ichiban Prolog)

go test ./examples/parity/

7 input combinations × 5 plan rules. Verifies that noematik's klarheit port produces the same plan tuples as cognicore's klarheit.pl, including the negation case (begriff + struktur → 0 plans because of the !matched(/struktur) blocker).

Learning (Kinship ILP)

go run ./examples/kinship/

Trainer iterates over five candidate rules for grandparent/2. The correct rule is adopted; an over-generalization is reviewer-vetoed (would derive forbidden facts); broken syntax is parser-rejected; an unstratifiable cycle is engine-rejected; an irrelevant fact is reviewer-vetoed for not adding marginal coverage.
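The adopt, reject, and veto outcomes can be sketched as a validator-as-filter loop: each candidate must first survive validation of the extended rule set and then a reviewer check before it is adopted. This is a standard-library illustration, not the `learn` package. `Outcome`, `Validator`, `Reviewer`, this `Step` signature, and the toy checks are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Outcome records what happened to one candidate rule.
type Outcome struct {
	Rule   string
	Status string // "adopted", "rejected" (validator) or "vetoed" (reviewer)
	Reason string
}

// Validator stands in for the load pipeline: it judges the whole rule set.
type Validator func(rules []string) error

// Reviewer judges a candidate that already passed validation.
type Reviewer func(current []string, candidate string) error

// Step tries every candidate against the rule set built so far.
func Step(current, candidates []string, validate Validator, review Reviewer) ([]string, []Outcome) {
	rules := append([]string{}, current...)
	var outcomes []Outcome
	for _, c := range candidates {
		trial := append(append([]string{}, rules...), c)
		if err := validate(trial); err != nil {
			outcomes = append(outcomes, Outcome{Rule: c, Status: "rejected", Reason: err.Error()})
			continue
		}
		if err := review(rules, c); err != nil {
			outcomes = append(outcomes, Outcome{Rule: c, Status: "vetoed", Reason: err.Error()})
			continue
		}
		rules = trial
		outcomes = append(outcomes, Outcome{Rule: c, Status: "adopted"})
	}
	return rules, outcomes
}

// toyValidate is a placeholder syntax check: every rule must end with '.'.
func toyValidate(rules []string) error {
	for _, r := range rules {
		if !strings.HasSuffix(r, ".") {
			return errors.New("syntax: missing terminating '.'")
		}
	}
	return nil
}

// toyReview vetoes a candidate that is already present, as a crude proxy
// for "adds no marginal coverage".
func toyReview(current []string, candidate string) error {
	for _, r := range current {
		if r == candidate {
			return errors.New("adds no new coverage")
		}
	}
	return nil
}

func main() {
	good := "grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."
	candidates := []string{good, "grandparent(X, Z) :- parent(X, Y), parent(Y, Z", good}
	rules, outcomes := Step(nil, candidates, toyValidate, toyReview)
	for _, o := range outcomes {
		fmt.Println(o.Status, o.Reason)
	}
	fmt.Println("rules adopted:", len(rules))
}
```

Because a rejected or vetoed candidate never reaches `rules`, the rule set returned by `Step` has passed the validator at every intermediate state.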

API surface

// Loading and validation
prog, err := engine.Load(source)
prog2, err := prog.Extend(rulesSource)              // add rules; full re-validation
prog3, err := prog.WithFacts([]engine.Fact{...})    // add facts; types enforced

// Typed query
atoms, err := prog.Atoms("predicate")
for _, a := range atoms {
    if v, ok := a.Args[0].Atom(); ok {
        // /atom-name
    }
    if v, ok := a.Args[1].Text(); ok {
        // raw UTF-8 string, no escapes
    }
    if v, ok := a.Args[2].Int(); ok {
        // int64
    }
    if list, ok := a.Args[3].List(); ok {
        // []Term
    }
}

// Convenience
results, err := prog.Query("predicate")   // canonical string form per atom

// Runtime
rt := &runtime.Runtime{
    Registry:       runtime.NewRegistry(),         // tools
    PluginRegistry: runtime.NewPluginRegistry(),   // optional
    MaxTurns:       10,
}
rt.Registry.Register(myTool)
rt.PluginRegistry.Register(mySemantic)

prog, err := engine.Load(skillSource)
res, err := rt.Hydrate(ctx, prog, runtime.PluginInput{TaskID: "/t1", Text: "..."}, nil)
result, err := rt.Run(ctx, res.Final, "/t1")

// Learning
tr := &learn.Trainer{Observer: ..., Proposer: ..., Reviewer: ...}
nextProg, outcomes, err := tr.Step(prog)

Skill file conventions

A skill is a .mg file: Datalog with an explicit Decl for every predicate the runtime might assert. The convention:

# Always-declared predicates (the agent runtime contract)
Decl task(T).
Decl matched(T, Term).
Decl executed(T, Tool, Args, Result).
Decl tool_used(T, Tool).
Decl next_action(T, Tool, Args).
Decl complete(T).

# Routing
Decl match_signal(Skill, T).
Decl match_blocker(Skill, T).
Decl accepts(Skill, T).
Decl accept_reason(Skill, T, Reason).

# Aux predicates for stratified-negation safety
tool_used(T, Tool) :- executed(T, Tool, _, _).

# Skill-specific routing rules
match_signal(/my_skill, T) :- matched(T, /keyword).
accepts(/my_skill, T) :-
    match_signal(/my_skill, T),
    !match_blocker(/my_skill, T).

# Skill-specific planning rules
next_action(T, /tool_x, [/arg]) :-
    accepts(/my_skill, T),
    !tool_used(T, /tool_x).

complete(T) :-
    accepts(/my_skill, T),
    executed(T, /tool_x, _, /ok).

Mangle disallows unbound variables under negation, so the tool_used/2 aux predicate is required to express "tool has not been used yet" without reaching for _ placeholders.

Test status

engine            86.9% statement coverage  race-clean
agent             88.6% statement coverage  race-clean
learn             97.4% statement coverage  race-clean
runtime           94.7% statement coverage  race-clean
plugins/semantic  92.6% statement coverage  race-clean
plugins/pmi       90.0% statement coverage  race-clean
examples/parity   passes 7/7 fixtures vs cognicore/agent baseline

Performance baseline (Ryzen 5 3400G):

| Operation | Time | Allocations |
|---|---|---|
| `Load` (small program) | 114µs | 474 |
| `Query` (recursive transitive closure) | 1.3µs | 10 |
| `WithFacts` (1 fact) | 75µs | 233 |
| `WithFacts` (50 facts) | 1.0ms | 2973 |

Production consumer

~/work/cognicore/agent migrated its planning engine from ichiban Prolog to noematik. Output is bit-identical on all 7 parity fixtures. The migration revealed one API gap: Diagnose needs a lastView cache because Programs are immutable. That gap has since been closed by agent.Session, which holds the evolving Program plus turn-scoped metadata behind a single read/write-locked façade.
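The idea of a read/write-locked façade over immutable Programs can be sketched with `sync.RWMutex`. This is an illustration of the pattern, not the actual Session implementation. The `Program` type, its toy `WithFacts` check, and the `View` and `NewSession` names are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Program is a hypothetical immutable rule set: WithFacts returns a new value
// and never modifies the receiver.
type Program struct{ facts []string }

func (p *Program) WithFacts(extra []string) (*Program, error) {
	for _, f := range extra {
		if f == "" {
			return nil, errors.New("empty fact") // toy validation
		}
	}
	next := make([]string, 0, len(p.facts)+len(extra))
	next = append(next, p.facts...)
	next = append(next, extra...)
	return &Program{facts: next}, nil
}

func (p *Program) Len() int { return len(p.facts) }

// Session guards the pointer to the current Program and the turn counter.
type Session struct {
	mu   sync.RWMutex
	prog *Program
	turn int
}

func NewSession(p *Program) *Session { return &Session{prog: p} }

// View returns the current Program and turn. Because Programs are immutable,
// the caller may keep using the returned value after the lock is released.
func (s *Session) View() (*Program, int) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.prog, s.turn
}

// AddFacts builds and validates the successor Program, then swaps it in.
func (s *Session) AddFacts(facts []string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	next, err := s.prog.WithFacts(facts)
	if err != nil {
		return err // the old Program stays current
	}
	s.prog = next
	s.turn++
	return nil
}

func main() {
	s := NewSession(&Program{})
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = s.AddFacts([]string{fmt.Sprintf("said(/t%d).", i)})
		}(i)
	}
	wg.Wait()
	p, turn := s.View()
	fmt.Println("facts:", p.Len(), "turn:", turn)
}
```

Readers never observe a partially updated state: they see either the previous Program or its fully validated successor.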

License

MIT.
