The bot now behaves like a real chat partner, not a learning machine:
small talk is answered freely, recognized topics are answered only from
what it knows (no bluffing), and facts are learned only when the user
actually asserts one. A new small-talk rule routes any unrecognized
message straight to the articulator.
Conversation history is now first-class and fed to BOTH glue edges:
- extraction (NL -> facts): so a follow-up like "it is the same as AfA"
resolves "it" from the prior turn before proposing a triple.
- articulation (facts -> NL): so answers and small talk stay coherent
across turns.
This is also the state the semantic layer can later read to understand
follow-ups and overall conversational context.
tools/llm: extract a reusable llm.Chat(ctx, Config, []Message) from
SayTool.Execute. SayTool keeps identical behaviour (tests unchanged);
the REPL uses Chat directly to assemble history-aware message lists for
both extraction and a history-aware /llm_say articulation tool.
Verified end to end: small talk -> ask (refuse) -> "it is the same as AfA"
(history resolves "it", learns synonym_of) -> recall -> persisted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
noematik
A symbolic-AI engine for agent harnesses. Skills are Datalog files, tools are Go plugins, and every rule set is validated brutally before it ever runs a query.
The contract: a rule set either loads cleanly through every validation gate, or it does not load at all. There is no "loaded but inconsistent" state. Once loaded, the engine is read-only and concurrency-safe.
Why this exists
Agent harnesses today encode their decision logic in procedural code:
hardcoded if/else over strings, scattered permission checks, ad-hoc
plan selection. That is fine when humans write the harness. It is not
fine when agents write the rules — which is the direction this
project is built for. LLMs write declarative facts and rules reliably;
they write procedural state machines unreliably. Make the rule layer
declarative, validate it brutally, restrict the procedural surface to
a small set of vetted plugins.
A rule set that loads is a rule set that is provably internally consistent. No "passes the test, fails in production" gap. Bugs in rules are syntactic or semantic and detectable at load time.
Architecture
                  ┌─────────────────────────┐
                  │  agent (cross-cut layer)│
                  │  Session, RunTurn,      │
                  │  memory subsystem,      │
                  │  Scheduler              │
                  └────┬────────────────┬───┘
                       │ uses           │ uses
    ┌──────────────────▼───┐     ┌──────▼─────────┐
    │  engine              │     │  runtime       │
    │  Mangle + validation │     │  Tool + Plugin │
    │  Program / Atom /    │     │  Run / Hydrate │
    │  Fact / Provenance   │     │                │
    └──────────────────────┘     └────────────────┘
              ▲                         ▲
              │                         │
         ┌────┴────┐               ┌────┴─────┐
         │ learn   │               │ plugins  │
         │ Trainer │               │ semantic │
         │ + Valid │               │ pmi      │
         │ filter  │               │          │
         └─────────┘               └──────────┘
engine and runtime are pure primitives — engine knows facts,
rules, types, and the validation pipeline; runtime knows Tools,
Plugins, and the per-task loop. agent is the assembly layer
above both, where stateful concerns live (Session, memory tiers)
and where chat-style entry points (RunTurn) compose engine
queries with runtime dispatch. Anything that touches the real
world (LLMs, embeddings, file I/O, network) is a Tool or a Plugin.
Validation pipeline
Every rule set passes through these gates on engine.Load. Failure
at any stage returns a typed LoadError{Stage, Cause} and no
Program is returned.
| Stage | Built by | Catches |
|---|---|---|
| parse | Mangle | Syntax errors with source position |
| analyze | Mangle | Decl well-formedness, undeclared predicates |
| stratify | Mangle (surfaced from eval) | Negation cycles |
| evaluate | Mangle | Bottom-up materialization of derived facts |
| typecheck | noematik | Decl bounds enforced against materialized facts |
Mangle's analysis pass validates that bound-decls are well-formed but
does not enforce them against facts; noematik's typecheck stage
closes that gap by running the type checker against every materialized
fact.
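
The idea behind the typecheck stage can be sketched in a few lines. This is a simplified model and not noematik's implementation: `Kind`, `Atom`, `Fact`, `matches`, and `typecheck` are hypothetical names, and real Mangle bounds are richer than the three kinds used here.

```go
package main

import "fmt"

// Kind is a hypothetical, simplified bound for one argument position.
type Kind int

const (
	KindAtom Kind = iota // name constants such as /alice
	KindText             // strings
	KindInt              // int64 numbers
)

// Atom marks a name constant so it can be told apart from plain text.
type Atom string

// Fact is a materialized fact: a predicate name plus its arguments.
type Fact struct {
	Pred string
	Args []any
}

// matches reports whether value v satisfies bound k.
func matches(k Kind, v any) bool {
	switch v.(type) {
	case Atom:
		return k == KindAtom
	case string:
		return k == KindText
	case int64:
		return k == KindInt
	}
	return false
}

// typecheck walks every materialized fact and returns the first
// violation of the declared bounds, or nil when all facts conform.
func typecheck(bounds map[string][]Kind, facts []Fact) error {
	for _, f := range facts {
		want, ok := bounds[f.Pred]
		if !ok {
			return fmt.Errorf("%s: no declaration", f.Pred)
		}
		if len(want) != len(f.Args) {
			return fmt.Errorf("%s: arity %d, declared %d", f.Pred, len(f.Args), len(want))
		}
		for i, a := range f.Args {
			if !matches(want[i], a) {
				return fmt.Errorf("%s: argument %d violates its declared bound", f.Pred, i)
			}
		}
	}
	return nil
}

func main() {
	bounds := map[string][]Kind{"age": {KindAtom, KindInt}}
	good := []Fact{{Pred: "age", Args: []any{Atom("/alice"), int64(30)}}}
	bad := []Fact{{Pred: "age", Args: []any{Atom("/bob"), "thirty"}}}
	fmt.Println(typecheck(bounds, good)) // prints "<nil>"
	fmt.Println(typecheck(bounds, bad))  // reports the bound violation
}
```

The point of the sketch is the ordering: the check runs over facts after evaluation, so derived facts are covered as well as asserted ones.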
Packages
    engine/             Mangle wrapper, validation, Load+Extend+WithFacts+Atoms,
                        Save+LoadFile, Provenance
    agent/              Cross-cut assembly: Session (stateful façade), full
                        memory subsystem (Compressor / Reaper / LoopDetector /
                        MemoryTiers + Consolidator), Scheduler
    runtime/            Tool + Plugin registries, per-task loop (Run, Hydrate),
                        audit trail
    learn/              Trainer, Observer/Proposer/Reviewer, validator-as-filter
    plugins/
      semantic/         Adapter for ~/work/semantic-layer: hydrates matched/2
      pmi/              Korel-style PMI observer: mines co-occurrence rules
                        from a corpus of token sequences
    examples/
      kinship/          Classical ILP: Trainer learns grandparent rule
      routing/          3 skills × 8 tasks: accept/reject verdict matrix
      klarheit/         Klarheit skill end-to-end: routing + plan + tool loop
      parity/           Bit-identical output check vs ichiban Prolog baseline
      chat/             Multi-turn Session with conversation history as facts
      pmi/              PMI plugin against a small synthetic corpus
      session-learn/    Trainer mining patterns from a synthetic Session log
      memory/           Compressor + Reaper + LoopDetector + Scheduler combined
      memory-tiers/     Short-term / long-term split via MemoryTiers
                        (decay-driven short, persisted long, with consolidation)
Memory subsystem
Long-running agents accumulate facts across turns. Without bounds
the source text grows linearly, re-eval slows down, and the agent
ends up reasoning over stale signals from twenty minutes ago. The
memory subsystem is five primitives that together turn a Session
into a working/long-term split with explicit lifecycles.
| Primitive | Where | Job |
|---|---|---|
| agent.Session | agent/session.go | Stateful façade over an evolving Program; tracks turn counter, addedAt, lastTouched, touchCount per fact |
| agent.Compressor | agent/compress.go | Replaces raw turn-facts older than PreserveLastN with a per-turn compressed_episode/3 summary |
| agent.Reaper | agent/decay.go | Drops base facts that exceed MaxAge (calendar) or MaxIdle (since last touch); per-predicate DecayRules combine with OR |
| agent.LoopDetector | agent/loop.go | Pure query over executed/4 audit; flags (Tool,Args) clusters that repeat in MinRepeats distinct turns within Window |
| agent.MemoryTiers + Consolidator | agent/tiers.go | Bundles two Sessions; Consolidate promotes short-term facts that meet MinAge AND MinTouches AND predicate filter to long-term |
| agent.Scheduler | agent/scheduler.go | Runs IntervalPlugins (runtime.Plugin + Interval()) on tickers, folds output into a bound Session via AddFacts |
Two design rules hold throughout:
- Validation-first ordering. Promote-before-remove, filter-then-Load, long-tier-Add before short-tier-RemoveFacts. A validator rejection never leaves a half-mutated session.
- Only base facts decay. Derived facts have no addedAt metadata and would be re-derived on the next eval anyway, so the Reaper skips them by design.
Persistence: prog.Save(path) writes a Program's source to disk;
engine.LoadFile(path) re-loads it through the full validation pipeline.
The memory-tiers example uses this to drop short-term and revive
long-term across process boundaries.
Working examples
Routing (multi-skill verdict)
cd ~/work/noematik
go run ./examples/routing/
Loads research_agent, coding_agent, klarheit_agent plus 8 tasks
with pre-computed matched/2 facts. Prints which skill accepts each
task with reasons; warns on multi-match; surfaces silent reject and
would-have-been-accepted-but-blocker cases.
Klarheit (full agent loop)
go run ./examples/klarheit/
Klarheit skill, three tasks, real tool dispatch:
- /t1 ("Begriffe definieren"): two-tool plan, completes in 2 turns
- /t2 ("Annahme offenlegen"): one-tool plan, completes in 1 turn
- /t3 (klarheit signal but /implement blocker): runtime refuses to start because no skill accepts
Parity (vs cognicore/agent ichiban Prolog)
go test ./examples/parity/
7 input combinations × 5 plan rules. Verifies that noematik's klarheit
port produces the same plan tuples as cognicore's klarheit.pl, including
the negation case (begriff + struktur → 0 plans because of
!matched(/struktur) blocker).
Learning (Kinship ILP)
go run ./examples/kinship/
Trainer iterates over five candidate rules for grandparent/2. The
correct rule is adopted; an over-generalization is reviewer-vetoed
(would derive forbidden facts); broken syntax is parser-rejected; an
unstratifiable cycle is engine-rejected; an irrelevant fact is
reviewer-vetoed for not adding marginal coverage.
API surface
    // Loading and validation
    prog, err := engine.Load(source)
    prog2, err := prog.Extend(rulesSource)            // add rules; full re-validation
    prog3, err := prog.WithFacts([]engine.Fact{...})  // add facts; types enforced

    // Typed query
    atoms, err := prog.Atoms("predicate")
    for _, a := range atoms {
        if v, ok := a.Args[0].Atom(); ok {
            // /atom-name
        }
        if v, ok := a.Args[1].Text(); ok {
            // raw UTF-8 string, no escapes
        }
        if v, ok := a.Args[2].Int(); ok {
            // int64
        }
        if list, ok := a.Args[3].List(); ok {
            // []Term
        }
    }

    // Convenience
    strings, err := prog.Query("predicate")  // canonical string form per atom

    // Runtime
    rt := &runtime.Runtime{
        Registry:       runtime.NewRegistry(),        // tools
        PluginRegistry: runtime.NewPluginRegistry(),  // optional
        MaxTurns:       10,
    }
    rt.Registry.Register(myTool)
    rt.PluginRegistry.Register(mySemantic)
    prog, err := engine.Load(skillSource)
    res, err := rt.Hydrate(ctx, prog, runtime.PluginInput{TaskID: "/t1", Text: "..."}, nil)
    result, err := rt.Run(ctx, res.Final, "/t1")

    // Learning
    tr := &learn.Trainer{Observer: ..., Proposer: ..., Reviewer: ...}
    nextProg, outcomes, err := tr.Step(prog)
Skill file conventions
A skill is a .mg file. Datalog with explicit Decl for every
predicate the runtime might assert. Convention:
    # Always-declared predicates (the agent runtime contract)
    Decl task(T).
    Decl matched(T, Term).
    Decl executed(T, Tool, Args, Result).
    Decl tool_used(T, Tool).
    Decl next_action(T, Tool, Args).
    Decl complete(T).

    # Routing
    Decl match_signal(Skill, T).
    Decl match_blocker(Skill, T).
    Decl accepts(Skill, T).
    Decl accept_reason(Skill, T, Reason).

    # Aux predicates for stratified-negation safety
    tool_used(T, Tool) :- executed(T, Tool, _, _).

    # Skill-specific routing rules
    match_signal(/my_skill, T) :- matched(T, /keyword).
    accepts(/my_skill, T) :-
        match_signal(/my_skill, T),
        !match_blocker(/my_skill, T).

    # Skill-specific planning rules
    next_action(T, /tool_x, [/arg]) :-
        accepts(/my_skill, T),
        !tool_used(T, /tool_x).
    complete(T) :-
        accepts(/my_skill, T),
        executed(T, /tool_x, _, /ok).
Mangle disallows unbound variables under negation, so the
tool_used/2 aux predicate is required to express "tool has not
been used yet" without reaching for _ placeholders.
Test status
    engine              86.9% statement coverage    race-clean
    agent               88.6% statement coverage    race-clean
    learn               97.4% statement coverage    race-clean
    runtime             94.7% statement coverage    race-clean
    plugins/semantic    92.6% statement coverage    race-clean
    plugins/pmi         90.0% statement coverage    race-clean
    examples/parity     passes 7/7 fixtures vs cognicore/agent baseline
Performance baseline (Ryzen 5 3400G):
| Operation | Time | Allocations |
|---|---|---|
| Load (small program) | 114µs | 474 |
| Query (recursive transitive closure) | 1.3µs | 10 |
| WithFacts (1 fact) | 75µs | 233 |
| WithFacts (50 facts) | 1.0ms | 2973 |
Production consumer
~/work/cognicore/agent migrated its planning engine from ichiban
Prolog to noematik. Output is bit-identical on all 7 parity fixtures.
The migration revealed one API gap (Diagnose needs a lastView cache
because Programs are immutable). That gap has since been closed by
agent.Session, which holds the evolving Program plus turn-scoped
metadata behind a single read/write-locked façade.
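
The shape of such a façade, a mutable holder around immutable values, can be sketched as follows. This is a simplified model, not the actual Session: `Program` here is a toy with string facts, its `WithFacts` omits the validation and error return of the real method, and `View` is a hypothetical accessor name.

```go
package main

import (
	"fmt"
	"sync"
)

// Program is a toy immutable value: WithFacts never modifies the
// receiver and always returns a new Program.
type Program struct{ facts []string }

func (p *Program) WithFacts(extra []string) *Program {
	next := make([]string, 0, len(p.facts)+len(extra))
	next = append(next, p.facts...)
	next = append(next, extra...)
	return &Program{facts: next}
}

// Session holds the current Program and a turn counter behind one
// read/write lock.
type Session struct {
	mu   sync.RWMutex
	prog *Program
	turn int
}

// AddFacts swaps in a new Program under the write lock.
func (s *Session) AddFacts(facts []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.prog = s.prog.WithFacts(facts)
	s.turn++
}

// View returns the current Program and turn under the read lock.
// The returned Program is immutable, so callers can keep using it
// after the lock is released.
func (s *Session) View() (*Program, int) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.prog, s.turn
}

func main() {
	s := &Session{prog: &Program{}}
	before, _ := s.View()
	s.AddFacts([]string{"task(/t1)"})
	after, turn := s.View()
	fmt.Println(len(before.facts), len(after.facts), turn) // prints "0 1 1"
}
```

Because each Program is immutable, a reader holding an older snapshot is never affected by later writes; the lock only protects the pointer swap and the metadata.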
License
MIT.