Help:Ephemera Agent/Logging & Training Data

From Encyclopedia Ephemera

Every agent call is recorded in full to a MariaDB table (ephemera_agent_log). This serves three purposes: operational auditability, debugging, and accumulation of real training data for eventual fine-tuning.

Database Table

The log table lives in the same MariaDB instance as MediaWiki (database: ephemera).

CREATE TABLE ephemera_agent_log (
  id                 BIGINT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  session_id         VARCHAR(64)  NOT NULL,
  timestamp          DATETIME     NOT NULL DEFAULT CURRENT_TIMESTAMP,
  mode               VARCHAR(32),
  provider           VARCHAR(32),
  model              VARCHAR(128),
  system_prompt      LONGTEXT,
  context_assembled  LONGTEXT,
  user_turn          LONGTEXT,
  response           LONGTEXT,
  tool_calls         JSON,
  target_page        VARCHAR(512),
  outcome            ENUM('used','modified','rejected','pending') DEFAULT 'pending',
  training_flag      TINYINT      NOT NULL DEFAULT 0,
  notes              TEXT
);

Field Reference

Field              Notes
session_id         Generated per request via uniqid(); groups related calls in one job
mode               Classified task type (e.g. create_encyclopedia_article, query)
provider / model   LLM provider and exact model name used for the generator call
system_prompt      Full augmented system prompt including assembled context
context_assembled  Raw context string passed to the model
user_turn          The task string submitted by the user
response           Full raw JSON response from the generator
tool_calls         Extracted tool use blocks (Anthropic and OpenAI formats both handled)
target_page        First classified entity, used as the likely target wiki page
outcome            pending by default; manually set to used, modified, or rejected after review
training_flag      0 = unassessed, 1 = good training candidate, -1 = exclude
notes              Human annotation field

How Logging Works

Logging is implemented in agent.php. After the generator LLM call completes and the response is decoded, log_agent_call() writes one row per agent invocation. The PDO connection is set to PDO::ERRMODE_SILENT, so a database failure never surfaces to the user or breaks a task.
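The fail-silent behaviour described above can be illustrated with a minimal sketch. This is not the agent.php implementation: it is written in Python, uses an in-memory SQLite database as a stand-in for MariaDB, and inserts only three of the table's columns. The run_task() helper and its echo response are hypothetical placeholders for the real generator call.

```python
import sqlite3


def log_agent_call(conn, row):
    """Insert one log row. Return True on success, False on any database error."""
    try:
        conn.execute(
            "INSERT INTO ephemera_agent_log (session_id, mode, response) "
            "VALUES (?, ?, ?)",
            (row["session_id"], row["mode"], row["response"]),
        )
        conn.commit()
        return True
    except sqlite3.Error:
        # Swallow the failure: logging must never break the user's task.
        return False


def run_task(conn, task):
    """Hypothetical task runner: produce a response, then log it best-effort."""
    response = f"echo: {task}"  # stand-in for the generator LLM call
    log_agent_call(conn, {"session_id": "s1", "mode": "query", "response": response})
    return response  # returned whether or not the log write succeeded
```

The design point is that the return value of the logging call is never allowed to affect the task result. The trade-off is that failed writes go unnoticed unless they are reported somewhere else.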

extract_tool_calls() handles both Anthropic-style tool use content blocks and OpenAI-style choices[].message.tool_calls arrays.
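As an illustration of handling both response shapes, the following Python sketch normalises tool calls into one list. It is an assumption about how such a function could look, not a port of the PHP code in agent.php; the output keys (id, name, arguments) are chosen for the example.

```python
import json


def extract_tool_calls(response):
    """Return a list of {"id", "name", "arguments"} dicts from a decoded response."""
    calls = []

    # Anthropic style: top-level "content" is a list of typed blocks,
    # and tool invocations are blocks with type "tool_use".
    content = response.get("content")
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                calls.append({
                    "id": block.get("id"),
                    "name": block.get("name"),
                    "arguments": block.get("input", {}),
                })

    # OpenAI style: choices[].message.tool_calls, where the function
    # arguments arrive as a JSON-encoded string.
    for choice in response.get("choices") or []:
        message = choice.get("message") or {}
        for call in message.get("tool_calls") or []:
            function = call.get("function") or {}
            raw = function.get("arguments", "{}")
            try:
                arguments = json.loads(raw) if isinstance(raw, str) else raw
            except json.JSONDecodeError:
                arguments = {"_raw": raw}  # keep unparseable arguments verbatim
            calls.append({
                "id": call.get("id"),
                "name": function.get("name"),
                "arguments": arguments,
            })

    return calls
```

A response with no tool use in either format yields an empty list, which can be stored in the tool_calls column as an empty JSON array.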

Logging happens in agent.php only; direct calls through llm-proxy.php are not logged.

Querying Logs

Connect to MariaDB as the ephemera user:

mysql -u ephemera -p ephemera

Useful queries:

-- Most recent calls
SELECT id, timestamp, mode, model, target_page, outcome
FROM ephemera_agent_log ORDER BY id DESC LIMIT 20;

-- All unassessed rows
SELECT id, timestamp, mode, target_page FROM ephemera_agent_log
WHERE training_flag = 0 ORDER BY id DESC;

-- Flag a row as a good training example
UPDATE ephemera_agent_log SET training_flag = 1, outcome = 'used'
WHERE id = <id>;

Passive Quality Signal

The target_page field enables a useful training signal. By joining ephemera_agent_log against MediaWiki's revision table (through the page table, since revision rows reference pages by ID rather than by title), it is possible to diff what the agent wrote against subsequent human edits. Corrections are captured as negative examples, and output left unchanged counts as positive signal.
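Once the agent's text and the current page text have been fetched, the comparison itself can be sketched with Python's standard difflib module. The function name, the returned keys, and the 0.9 similarity threshold are all hypothetical. Fetching the two texts from the database is out of scope here.

```python
import difflib


def edit_signal(agent_text, current_text, threshold=0.9):
    """Compare agent output with the page's current text.

    Returns a similarity ratio in [0, 1], a coarse label, and a unified diff.
    The threshold is an illustrative assumption, not a tuned value.
    """
    ratio = difflib.SequenceMatcher(None, agent_text, current_text).ratio()
    label = "positive" if ratio >= threshold else "negative"
    diff = list(difflib.unified_diff(
        agent_text.splitlines(),
        current_text.splitlines(),
        fromfile="agent",
        tofile="human",
        lineterm="",
    ))
    return {"similarity": ratio, "label": label, "diff": diff}
```

The diff lines are kept alongside the label because a correction is most useful as training data when the exact change is preserved, not only the fact that a change occurred.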

Retention Policy (Planned)

  • Unflagged rows: eligible for pruning after a defined retention window (placeholder: 90 days)
  • Rows with training_flag = 1: archived to JSONL before pruning
  • Rows with training_flag = -1: pruned immediately on next cleanup pass
  • Session summaries: preserved indefinitely in the Logs: namespace (not yet implemented)
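Because the policy is still planned, the following Python sketch is only one possible reading of the rules above. It assumes that rows flagged 1 are archived and pruned once they exceed the same retention window as unflagged rows, which the list does not state explicitly. The action names and the JSONL record keys are hypothetical.

```python
import json
from datetime import datetime, timedelta

RETENTION_WINDOW = timedelta(days=90)  # placeholder value from the policy above


def retention_action(row, now):
    """Decide what a cleanup pass should do with one log row."""
    flag = row["training_flag"]
    if flag == -1:
        return "prune"  # excluded rows go on the next pass, regardless of age
    expired = now - row["timestamp"] > RETENTION_WINDOW
    if flag == 1:
        # Assumption: good candidates are archived first, then pruned.
        return "archive_then_prune" if expired else "keep"
    return "prune" if expired else "keep"


def to_jsonl_line(row):
    """Serialise one row as a single JSONL line (hypothetical record layout)."""
    record = {
        "system": row["system_prompt"],
        "user": row["user_turn"],
        "response": row["response"],
    }
    return json.dumps(record, ensure_ascii=False)
```

json.dumps escapes embedded newlines, so each archived row stays on one line even when the prompt or response spans several.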

Fine-Tuning Roadmap

Fine-tuning is not planned until the system is mature and good data has accumulated naturally.

Target                 Use Case                                        Timeline
Format/structure LoRA  Reliable MediaWiki markup and template output   Medium-term
Domain knowledge LoRA  System-specific knowledge in weights            Long-term
Worldbuilding LoRA     Ephemera universe knowledge (separate agent)    Long-term

Fireworks supports LoRA fine-tuning and serves fine-tuned models at base model prices.