EphemeraAdmin: Add logging & training data documentation (implemented 2026-04-22)

2026-04-23T02:16:33Z

{{DISPLAYTITLE:Ephemera Agent — Logging & Training Data}}
__TOC__
Every agent call is recorded in full to a MariaDB table (<code>ephemera_agent_log</code>). This serves three purposes: operational auditability, debugging, and accumulation of real training data for eventual fine-tuning.

== Database Table ==

The log table lives in the same MariaDB instance as MediaWiki (database: <code>ephemera</code>).

<pre>
CREATE TABLE ephemera_agent_log (
    id                BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    session_id        VARCHAR(64) NOT NULL,
    timestamp         DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    mode              VARCHAR(32),
    provider          VARCHAR(32),
    model             VARCHAR(128),
    system_prompt     LONGTEXT,
    context_assembled LONGTEXT,
    user_turn         LONGTEXT,
    response          LONGTEXT,
    tool_calls        JSON,
    target_page       VARCHAR(512),
    outcome           ENUM('used','modified','rejected','pending') DEFAULT 'pending',
    training_flag     TINYINT NOT NULL DEFAULT 0,
    notes             TEXT
);
</pre>

=== Field Reference ===

{| class="wikitable"
! Field !! Notes
|-
| <code>session_id</code> || Generated per request via <code>uniqid()</code>; groups related calls in one job
|-
| <code>mode</code> || Classified task type (e.g. <code>create_encyclopedia_article</code>, <code>query</code>)
|-
| <code>provider</code> / <code>model</code> || LLM provider and exact model name used for the generator call
|-
| <code>system_prompt</code> || Full augmented system prompt including assembled context
|-
| <code>context_assembled</code> || Raw context string passed to the model
|-
| <code>user_turn</code> || The task string submitted by the user
|-
| <code>response</code> || Full raw JSON response from the generator
|-
| <code>tool_calls</code> || Extracted tool use blocks (Anthropic and OpenAI formats both handled)
|-
| <code>target_page</code> || First classified entity, used as the likely target wiki page
|-
| <code>outcome</code> || <code>pending</code> by default; manually set to <code>used</code>, <code>modified</code>, or <code>rejected</code> after review
|-
| <code>training_flag</code> || 0 = unassessed, 1 = good training candidate, -1 = exclude
|-
| <code>notes</code> || Human annotation field
|}

== How Logging Works ==

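Logging is implemented in <code>agent.php</code>. After the generator LLM call completes and the response is decoded, <code>log_agent_call()</code> writes one row per agent invocation. The PDO connection uses <code>PDO::ERRMODE_SILENT</code>, so a failed query never surfaces to the user or breaks a task. Because the PDO constructor throws a <code>PDOException</code> on connection failure regardless of error mode, the connection attempt also needs its own <code>try</code>/<code>catch</code> for that guarantee to hold.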
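The fail-silent behaviour described above can be illustrated with a short sketch. The real implementation is PHP; this Python version only shows the pattern. The <code>execute</code> callable is a hypothetical stand-in for a cursor's <code>execute</code> method on a driver that accepts <code>%s</code> placeholders (PyMySQL, for example), and the column list is copied from the <code>CREATE TABLE</code> statement.

```python
import json
from typing import Any, Callable, Mapping

# Columns written explicitly; id, timestamp, outcome, training_flag and notes
# are left to their table defaults. (Assumption: the PHP code does the same.)
LOG_COLUMNS = (
    "session_id", "mode", "provider", "model", "system_prompt",
    "context_assembled", "user_turn", "response", "tool_calls", "target_page",
)


def log_agent_call(execute: Callable[[str, tuple], Any], row: Mapping[str, Any]) -> bool:
    """Insert one log row. Returns False instead of raising on any failure."""
    try:
        values = tuple(
            # The tool_calls column is JSON, so serialize it before binding.
            json.dumps(row[col]) if col == "tool_calls" and row.get(col) is not None
            else row.get(col)
            for col in LOG_COLUMNS
        )
        sql = "INSERT INTO ephemera_agent_log ({}) VALUES ({})".format(
            ", ".join(LOG_COLUMNS), ", ".join(["%s"] * len(LOG_COLUMNS))
        )
        execute(sql, values)
        return True
    except Exception:
        # Deliberately swallow everything: a logging failure must not break the task.
        return False
```

The boolean return value lets a caller count dropped log rows without ever exposing the error to the user.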

The call to <code>extract_tool_calls()</code> handles both Anthropic-style <code>content</code> blocks and OpenAI-style <code>choices[].message.tool_calls</code> arrays.
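As a rough illustration of the two response shapes, the following Python sketch extracts tool calls from an already-decoded response. It is not the PHP implementation: the normalized <code>{id, name, arguments}</code> output shape is an assumption made for this example, as is the decision to keep unparseable OpenAI argument strings under a <code>_raw</code> key.

```python
import json
from typing import Any


def extract_tool_calls(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return tool calls from a decoded response as {id, name, arguments} dicts."""
    calls: list[dict[str, Any]] = []

    # Anthropic style: top-level "content" is a list of typed blocks,
    # and tool invocations are blocks with type "tool_use".
    content = response.get("content")
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                calls.append({
                    "id": block.get("id"),
                    "name": block.get("name"),
                    "arguments": block.get("input", {}),
                })

    # OpenAI style: each choice's message may carry a "tool_calls" array,
    # with arguments encoded as a JSON string.
    for choice in response.get("choices") or []:
        message = choice.get("message") or {}
        for call in message.get("tool_calls") or []:
            function = call.get("function") or {}
            raw_args = function.get("arguments", "{}")
            try:
                arguments = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
            except json.JSONDecodeError:
                arguments = {"_raw": raw_args}  # keep unparseable arguments verbatim
            calls.append({
                "id": call.get("id"),
                "name": function.get("name"),
                "arguments": arguments,
            })

    return calls
```

A response with neither shape yields an empty list, which maps naturally onto an empty JSON array in the <code>tool_calls</code> column.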

Only calls that pass through <code>agent.php</code> are logged; direct calls through <code>llm-proxy.php</code> are not.

== Querying Logs ==

Connect to MariaDB as the <code>ephemera</code> user:

<pre>
mysql -u ephemera -p ephemera
</pre>

Useful queries:

<pre>
-- Most recent calls
SELECT id, timestamp, mode, model, target_page, outcome
FROM ephemera_agent_log
ORDER BY id DESC
LIMIT 20;

-- All unassessed rows
SELECT id, timestamp, mode, target_page
FROM ephemera_agent_log
WHERE training_flag = 0
ORDER BY id DESC;

-- Flag a row as a good training example
UPDATE ephemera_agent_log
SET training_flag = 1, outcome = 'used'
WHERE id = <id>;
</pre>

== Passive Quality Signal ==

The <code>target_page</code> field enables a passive training signal. By joining <code>ephemera_agent_log</code> against MediaWiki's <code>revision</code> table (through <code>page</code>, because revisions are keyed by page ID rather than title), the agent's output can be diffed against subsequent human edits: corrections become negative examples, and output that survives unchanged is a positive signal.
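The comparison step could look something like the Python sketch below. It assumes the agent's text and the text of a later revision have already been fetched as strings. The function name <code>edit_signal</code>, the line-level comparison, and the 0.98 threshold for "unchanged" are all illustrative assumptions, not part of the current system.

```python
import difflib


def edit_signal(agent_text: str, later_text: str, unchanged_threshold: float = 0.98) -> dict:
    """Compare agent output with a later revision and label the result.

    Similarity is the difflib ratio over lines (1.0 means identical).
    """
    agent_lines = agent_text.splitlines()
    later_lines = later_text.splitlines()

    similarity = difflib.SequenceMatcher(
        None, agent_lines, later_lines, autojunk=False
    ).ratio()

    # The unified diff preserves the human corrections themselves, which is
    # the part most useful as a negative example.
    diff = list(difflib.unified_diff(
        agent_lines, later_lines, fromfile="agent", tofile="human", lineterm=""
    ))

    return {
        "similarity": similarity,
        "label": "positive" if similarity >= unchanged_threshold else "negative",
        "diff": diff,
    }
```

A line-level ratio is coarse: a one-character typo fix counts the whole line as changed. For long articles that may be acceptable, but a character-level comparison or a lower threshold might suit short pages better.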

== Retention Policy (Planned) ==

* Unflagged rows: eligible for pruning after a defined retention window (placeholder: 90 days)
* Rows with <code>training_flag = 1</code>: archived to JSONL before pruning
* Rows with <code>training_flag = -1</code>: pruned immediately on next cleanup pass
* Session summaries: preserved indefinitely in the <code>Logs:</code> namespace (not yet implemented)
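Since the policy is still only planned, the following Python sketch is one possible reading of the rules above rather than a description of existing code. It assumes that rows with <code>training_flag = 1</code> share the same 90-day placeholder window as unflagged rows, and that a JSONL record carries just the prompt, user turn, and response. The function names are hypothetical.

```python
import json
from datetime import datetime, timedelta
from typing import Any, Mapping

# Placeholder window from the planned policy; expected to change.
RETENTION_WINDOW = timedelta(days=90)


def retention_action(training_flag: int, logged_at: datetime, now: datetime) -> str:
    """Decide what a cleanup pass should do with one log row."""
    if training_flag == -1:
        return "prune"  # excluded rows go on the next pass, regardless of age
    expired = now - logged_at > RETENTION_WINDOW
    if training_flag == 1:
        # Assumption: good candidates are archived only once they age out.
        return "archive_then_prune" if expired else "keep"
    return "prune" if expired else "keep"


def to_jsonl_line(row: Mapping[str, Any]) -> str:
    """Serialize the training-relevant fields of a row as one JSONL line."""
    record = {
        "system_prompt": row.get("system_prompt"),
        "user_turn": row.get("user_turn"),
        "response": row.get("response"),
    }
    # ensure_ascii=False keeps non-ASCII wiki text readable in the archive.
    return json.dumps(record, ensure_ascii=False)
```

Keeping the decision in a pure function makes the policy easy to test before any <code>DELETE</code> statement is wired up.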

== Fine-Tuning Roadmap ==

Fine-tuning is not planned until the system is mature and good data has accumulated naturally.

{| class="wikitable"
! Target !! Use Case !! Timeline
|-
| Format/structure LoRA || Reliable MediaWiki markup and template output || Medium-term
|-
| Domain knowledge LoRA || System-specific knowledge in weights || Long-term
|-
| Worldbuilding LoRA || Ephemera universe knowledge (separate agent) || Long-term
|}

Fireworks supports LoRA fine-tuning and serves fine-tuned models at base model prices.

[[Category:Help]]
[[Category:EphemeraAgent]]

Help:Ephemera Agent/Logging & Training Data - Revision history
